From: zooko Date: Wed, 18 Apr 2007 16:19:00 +0000 (+0530) Subject: pyfec: rename pyfec to zfec X-Git-Url: https://git.rkrishnan.org/pf/content/en/frontends/somewhere?a=commitdiff_plain;h=863c01ad4907ad5de8318df33499f709b30ff40c;p=tahoe-lafs%2Fzfec.git pyfec: rename pyfec to zfec It turns out that "pyfec" turns off people who aren't Python hackers. "allmyfec" is too long for a low-level core utility library. "fec" is too generic (it is already used by Luigi Rizzo's library which this library is based on, for one thing). I'm open to other naming suggestions, especially before we widely announce this library, which I expect will happen within a few days. darcs-hash:2c05ab5eaee961453b5ff41587cc72d4fd7bb51d --- diff --git a/pyfec/COPYING b/pyfec/COPYING deleted file mode 100644 index eb42bd5..0000000 --- a/pyfec/COPYING +++ /dev/null @@ -1,346 +0,0 @@ -In addition to the terms of the GNU General Public License, the pyfec package -also comes with a special added permission that you may delay for up to 12 -months the fulfillment of your obligation under GPL section 2.b to release any -derived work under this licence. - - - - GNU GENERAL PUBLIC LICENSE - Version 2, June 1991 - - Copyright (C) 1989, 1991 Free Software Foundation, Inc., - 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA - Everyone is permitted to copy and distribute verbatim copies - of this license document, but changing it is not allowed. - - Preamble - - The licenses for most software are designed to take away your -freedom to share and change it. By contrast, the GNU General Public -License is intended to guarantee your freedom to share and change free -software--to make sure the software is free for all its users. This -General Public License applies to most of the Free Software -Foundation's software and to any other program whose authors commit to -using it. (Some other Free Software Foundation software is covered by -the GNU Lesser General Public License instead.) You can apply it to -your programs, too. - - When we speak of free software, we are referring to freedom, not -price. Our General Public Licenses are designed to make sure that you -have the freedom to distribute copies of free software (and charge for -this service if you wish), that you receive source code or can get it -if you want it, that you can change the software or use pieces of it -in new free programs; and that you know you can do these things. - - To protect your rights, we need to make restrictions that forbid -anyone to deny you these rights or to ask you to surrender the rights. -These restrictions translate to certain responsibilities for you if you -distribute copies of the software, or if you modify it. - - For example, if you distribute copies of such a program, whether -gratis or for a fee, you must give the recipients all the rights that -you have. You must make sure that they, too, receive or can get the -source code. And you must show them these terms so they know their -rights. - - We protect your rights with two steps: (1) copyright the software, and -(2) offer you this license which gives you legal permission to copy, -distribute and/or modify the software. - - Also, for each author's protection and ours, we want to make certain -that everyone understands that there is no warranty for this free -software. If the software is modified by someone else and passed on, we -want its recipients to know that what they have is not the original, so -that any problems introduced by others will not reflect on the original -authors' reputations. - - Finally, any free program is threatened constantly by software -patents. We wish to avoid the danger that redistributors of a free -program will individually obtain patent licenses, in effect making the -program proprietary. To prevent this, we have made it clear that any -patent must be licensed for everyone's free use or not licensed at all. - - The precise terms and conditions for copying, distribution and -modification follow. - - GNU GENERAL PUBLIC LICENSE - TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION - - 0. This License applies to any program or other work which contains -a notice placed by the copyright holder saying it may be distributed -under the terms of this General Public License. The "Program", below, -refers to any such program or work, and a "work based on the Program" -means either the Program or any derivative work under copyright law: -that is to say, a work containing the Program or a portion of it, -either verbatim or with modifications and/or translated into another -language. (Hereinafter, translation is included without limitation in -the term "modification".) Each licensee is addressed as "you". - -Activities other than copying, distribution and modification are not -covered by this License; they are outside its scope. The act of -running the Program is not restricted, and the output from the Program -is covered only if its contents constitute a work based on the -Program (independent of having been made by running the Program). -Whether that is true depends on what the Program does. - - 1. You may copy and distribute verbatim copies of the Program's -source code as you receive it, in any medium, provided that you -conspicuously and appropriately publish on each copy an appropriate -copyright notice and disclaimer of warranty; keep intact all the -notices that refer to this License and to the absence of any warranty; -and give any other recipients of the Program a copy of this License -along with the Program. - -You may charge a fee for the physical act of transferring a copy, and -you may at your option offer warranty protection in exchange for a fee. - - 2. You may modify your copy or copies of the Program or any portion -of it, thus forming a work based on the Program, and copy and -distribute such modifications or work under the terms of Section 1 -above, provided that you also meet all of these conditions: - - a) You must cause the modified files to carry prominent notices - stating that you changed the files and the date of any change. - - b) You must cause any work that you distribute or publish, that in - whole or in part contains or is derived from the Program or any - part thereof, to be licensed as a whole at no charge to all third - parties under the terms of this License. - - c) If the modified program normally reads commands interactively - when run, you must cause it, when started running for such - interactive use in the most ordinary way, to print or display an - announcement including an appropriate copyright notice and a - notice that there is no warranty (or else, saying that you provide - a warranty) and that users may redistribute the program under - these conditions, and telling the user how to view a copy of this - License. (Exception: if the Program itself is interactive but - does not normally print such an announcement, your work based on - the Program is not required to print an announcement.) - -These requirements apply to the modified work as a whole. If -identifiable sections of that work are not derived from the Program, -and can be reasonably considered independent and separate works in -themselves, then this License, and its terms, do not apply to those -sections when you distribute them as separate works. But when you -distribute the same sections as part of a whole which is a work based -on the Program, the distribution of the whole must be on the terms of -this License, whose permissions for other licensees extend to the -entire whole, and thus to each and every part regardless of who wrote it. - -Thus, it is not the intent of this section to claim rights or contest -your rights to work written entirely by you; rather, the intent is to -exercise the right to control the distribution of derivative or -collective works based on the Program. - -In addition, mere aggregation of another work not based on the Program -with the Program (or with a work based on the Program) on a volume of -a storage or distribution medium does not bring the other work under -the scope of this License. - - 3. You may copy and distribute the Program (or a work based on it, -under Section 2) in object code or executable form under the terms of -Sections 1 and 2 above provided that you also do one of the following: - - a) Accompany it with the complete corresponding machine-readable - source code, which must be distributed under the terms of Sections - 1 and 2 above on a medium customarily used for software interchange; or, - - b) Accompany it with a written offer, valid for at least three - years, to give any third party, for a charge no more than your - cost of physically performing source distribution, a complete - machine-readable copy of the corresponding source code, to be - distributed under the terms of Sections 1 and 2 above on a medium - customarily used for software interchange; or, - - c) Accompany it with the information you received as to the offer - to distribute corresponding source code. (This alternative is - allowed only for noncommercial distribution and only if you - received the program in object code or executable form with such - an offer, in accord with Subsection b above.) - -The source code for a work means the preferred form of the work for -making modifications to it. For an executable work, complete source -code means all the source code for all modules it contains, plus any -associated interface definition files, plus the scripts used to -control compilation and installation of the executable. However, as a -special exception, the source code distributed need not include -anything that is normally distributed (in either source or binary -form) with the major components (compiler, kernel, and so on) of the -operating system on which the executable runs, unless that component -itself accompanies the executable. - -If distribution of executable or object code is made by offering -access to copy from a designated place, then offering equivalent -access to copy the source code from the same place counts as -distribution of the source code, even though third parties are not -compelled to copy the source along with the object code. - - 4. You may not copy, modify, sublicense, or distribute the Program -except as expressly provided under this License. Any attempt -otherwise to copy, modify, sublicense or distribute the Program is -void, and will automatically terminate your rights under this License. -However, parties who have received copies, or rights, from you under -this License will not have their licenses terminated so long as such -parties remain in full compliance. - - 5. You are not required to accept this License, since you have not -signed it. However, nothing else grants you permission to modify or -distribute the Program or its derivative works. These actions are -prohibited by law if you do not accept this License. Therefore, by -modifying or distributing the Program (or any work based on the -Program), you indicate your acceptance of this License to do so, and -all its terms and conditions for copying, distributing or modifying -the Program or works based on it. - - 6. Each time you redistribute the Program (or any work based on the -Program), the recipient automatically receives a license from the -original licensor to copy, distribute or modify the Program subject to -these terms and conditions. You may not impose any further -restrictions on the recipients' exercise of the rights granted herein. -You are not responsible for enforcing compliance by third parties to -this License. - - 7. If, as a consequence of a court judgment or allegation of patent -infringement or for any other reason (not limited to patent issues), -conditions are imposed on you (whether by court order, agreement or -otherwise) that contradict the conditions of this License, they do not -excuse you from the conditions of this License. If you cannot -distribute so as to satisfy simultaneously your obligations under this -License and any other pertinent obligations, then as a consequence you -may not distribute the Program at all. For example, if a patent -license would not permit royalty-free redistribution of the Program by -all those who receive copies directly or indirectly through you, then -the only way you could satisfy both it and this License would be to -refrain entirely from distribution of the Program. - -If any portion of this section is held invalid or unenforceable under -any particular circumstance, the balance of the section is intended to -apply and the section as a whole is intended to apply in other -circumstances. - -It is not the purpose of this section to induce you to infringe any -patents or other property right claims or to contest validity of any -such claims; this section has the sole purpose of protecting the -integrity of the free software distribution system, which is -implemented by public license practices. Many people have made -generous contributions to the wide range of software distributed -through that system in reliance on consistent application of that -system; it is up to the author/donor to decide if he or she is willing -to distribute software through any other system and a licensee cannot -impose that choice. - -This section is intended to make thoroughly clear what is believed to -be a consequence of the rest of this License. - - 8. If the distribution and/or use of the Program is restricted in -certain countries either by patents or by copyrighted interfaces, the -original copyright holder who places the Program under this License -may add an explicit geographical distribution limitation excluding -those countries, so that distribution is permitted only in or among -countries not thus excluded. In such case, this License incorporates -the limitation as if written in the body of this License. - - 9. The Free Software Foundation may publish revised and/or new versions -of the General Public License from time to time. Such new versions will -be similar in spirit to the present version, but may differ in detail to -address new problems or concerns. - -Each version is given a distinguishing version number. If the Program -specifies a version number of this License which applies to it and "any -later version", you have the option of following the terms and conditions -either of that version or of any later version published by the Free -Software Foundation. If the Program does not specify a version number of -this License, you may choose any version ever published by the Free Software -Foundation. - - 10. If you wish to incorporate parts of the Program into other free -programs whose distribution conditions are different, write to the author -to ask for permission. For software which is copyrighted by the Free -Software Foundation, write to the Free Software Foundation; we sometimes -make exceptions for this. Our decision will be guided by the two goals -of preserving the free status of all derivatives of our free software and -of promoting the sharing and reuse of software generally. - - NO WARRANTY - - 11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY -FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN -OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES -PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED -OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF -MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS -TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE -PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, -REPAIR OR CORRECTION. - - 12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING -WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR -REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, -INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING -OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED -TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY -YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER -PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE -POSSIBILITY OF SUCH DAMAGES. - - END OF TERMS AND CONDITIONS - - How to Apply These Terms to Your New Programs - - If you develop a new program, and you want it to be of the greatest -possible use to the public, the best way to achieve this is to make it -free software which everyone can redistribute and change under these terms. - - To do so, attach the following notices to the program. It is safest -to attach them to the start of each source file to most effectively -convey the exclusion of warranty; and each file should have at least -the "copyright" line and a pointer to where the full notice is found. - - - Copyright (C) - - This program is free software; you can redistribute it and/or modify - it under the terms of the GNU General Public License as published by - the Free Software Foundation; either version 2 of the License, or - (at your option) any later version. - - This program is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - GNU General Public License for more details. - - You should have received a copy of the GNU General Public License along - with this program; if not, write to the Free Software Foundation, Inc., - 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA. - -Also add information on how to contact you by electronic and paper mail. - -If the program is interactive, make it output a short notice like this -when it starts in an interactive mode: - - Gnomovision version 69, Copyright (C) year name of author - Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'. - This is free software, and you are welcome to redistribute it - under certain conditions; type `show c' for details. - -The hypothetical commands `show w' and `show c' should show the appropriate -parts of the General Public License. Of course, the commands you use may -be called something other than `show w' and `show c'; they could even be -mouse-clicks or menu items--whatever suits your program. - -You should also get your employer (if you work as a programmer) or your -school, if any, to sign a "copyright disclaimer" for the program, if -necessary. Here is a sample; alter the names: - - Yoyodyne, Inc., hereby disclaims all copyright interest in the program - `Gnomovision' (which makes passes at compilers) written by James Hacker. - - , 1 April 1989 - Ty Coon, President of Vice - -This General Public License does not permit incorporating your program into -proprietary programs. If your program is a subroutine library, you may -consider it more useful to permit linking proprietary applications with the -library. If this is what you want to do, use the GNU Lesser General -Public License instead of this License. diff --git a/pyfec/README.txt b/pyfec/README.txt deleted file mode 100644 index 237b006..0000000 --- a/pyfec/README.txt +++ /dev/null @@ -1,215 +0,0 @@ - * Intro and Licence - -This package implements an "erasure code", or "forward error correction code". -It is offered under the GNU General Public License v2 or (at your option) any -later version. This package also comes with the added permission that, in the -case that you are obligated to release a derived work under this licence (as -per section 2.b of the GPL), you may delay the fulfillment of this obligation -for up to 12 months. - -The most widely known example of an erasure code is the RAID-5 algorithm which -makes it so that in the event of the loss of any one hard drive, the stored -data can be completely recovered. The algorithm in the pyfec package has a -similar effect, but instead of recovering from the loss of only a single -element, it can be parameterized to choose in advance the number of elements -whose loss it can tolerate. - -This package is largely based on the old "fec" library by Luigi Rizzo et al., -which is a mature and optimized implementation of erasure coding. The pyfec -package makes several changes from the original "fec" package, including -addition of the Python API, refactoring of the C API to support zero-copy -operation, a few clean-ups and micro-optimizations of the core code itself, -and the addition of a command-line tool named "fec". - - - * Community - -The source is currently available via darcs on the web with the command: - -darcs get http://www.allmydata.com/source/pyfec - -More information on darcs is available at http://darcs.net - -Please join the pyfec mailing list and submit patches: - - - - - * Overview - -This package performs two operations, encoding and decoding. Encoding takes -some input data and expands its size by producing extra "check blocks", also -called "secondary blocks". Decoding takes some data -- any combination of -blocks of the original data (called "primary blocks") and "secondary blocks", -and produces the original data. - -The encoding is parameterized by two integers, k and m. m is the total number -of blocks produced, and k is how many of those blocks are necessary to -reconstruct the original data. m is required to be at least 1 and at most 256, -and k is required to be at least 1 and at most m. - -(Note that when k == m then there is no point in doing erasure coding -- it -degenerates to the equivalent of the Unix "split" utility which simply splits -the input into successive segments. Similarly, when k == 1 it degenerates to -the equivalent of the unix "cp" utility -- each block is a complete copy of the -input data. The "fec" command-line tool does not implement these degenerate -cases.) - -Note that each "primary block" is a segment of the original data, so its size -is 1/k'th of the size of original data, and each "secondary block" is of the -same size, so the total space used by all the blocks is m/k times the size of -the original data (plus some padding to fill out the last primary block to be -the same size as all the others). In addition to the data contained in the -blocks themselves there are also a few pieces of metadata which are necessary -for later reconstruction. Those pieces are: 1. the value of K, 2. the value -of M, 3. the sharenum of each block, 4. the number of bytes of padding -that were used. The "fec" command-line tool compresses these pieces of data -and prepends them to the beginning of each share, so each the sharefile -produced by the "fec" command-line tool is between one and four bytes larger -than the share data alone. - -The decoding step requires as input k of the blocks which were produced by the -encoding step. The decoding step produces as output the data that was earlier -input to the encoding step. - - - * Command-Line Tool - -The bin/ directory contains two Unix-style, command-line tools "fec" and -"unfec". Execute "fec --help" or "unfec --help" for usage instructions. - -Note: a Unix-style tool like "fec" does only one thing -- in this case -erasure coding -- and leaves other tasks to other tools. Other Unix-style -tools that go well with fec include "GNU tar" for archiving multiple files and -directories into one file, "rzip" for compression, "GNU Privacy Guard" for -encryption, and "sha256sum" for integrity. It is important to do things in -order: first archive, then compress, then either encrypt or sha256sum, then -erasure code. Note that if GNU Privacy Guard is used for privacy, then it will -also ensure integrity, so the use of sha256sum is unnecessary in that case. - - - * Performance Measurements - -On my Athlon 64 2.4 GHz workstation (running Linux), the "fec" command-line -tool encoded a 160 MB file with m=100, k=94 (about 6% redundancy) in 3.9 -seconds, where the "par2" tool encoded the file with about 6% redundancy in -27 seconds. "fec" encoded the same file with m=12, k=6 (100% redundancy) in -4.1 seconds, where par2 encoded it with about 100% redundancy in 7 minutes -and 56 seconds. - -The underlying C library in benchmark mode encoded from a file at about -4.9 million bytes per second and decoded at about 5.8 million bytes per second. - -On Peter's fancy Intel Mac laptop (2.16 GHz Core Duo), it encoded from a file -at about 6.2 million bytes per second. - -On my even fancier Intel Mac laptop (2.33 GHz Core Duo), it encoded from a file -at about 6.8 million bytes per second. - -On my old PowerPC G4 867 MHz Mac laptop, it encoded from a file at about 1.3 -million bytes per second. - - - * API - -Each block is associated with "blocknum". The blocknum of each primary block is -its index (starting from zero), so the 0'th block is the first primary block, -which is the first few bytes of the file, the 1'st block is the next primary -block, which is the next few bytes of the file, and so on. The last primary -block has blocknum k-1. The blocknum of each secondary block is an arbitrary -integer between k and 255 inclusive. (When using the Python API, if you don't -specify which blocknums you want for your secondary blocks when invoking -encode(), then it will by default provide the blocks with ids from k to m-1 -inclusive.) - - ** C API - -fec_encode() takes as input an array of k pointers, where each pointer points -to a memory buffer containing the input data (i.e., the i'th buffer contains -the i'th primary block). There is also a second parameter which is an array of -the blocknums of the secondary blocks which are to be produced. (Each element -in that array is required to be the blocknum of a secondary block, i.e. it is -required to be >= k and < m.) - -The output from fec_encode() is the requested set of secondary blocks which are -written into output buffers provided by the caller. - -fec_decode() takes as input an array of k pointers, where each pointer points -to a buffer containing a block. There is also a separate input parameter which -is an array of blocknums, indicating the blocknum of each of the blocks which is -being passed in. - -The output from fec_decode() is the set of primary blocks which were missing -from the input and had to be reconstructed. These reconstructed blocks are -written into putput buffers provided by the caller. - - ** Python API - -encode() and decode() take as input a sequence of k buffers, where a "sequence" -is any object that implements the Python sequence protocol (such as a list or -tuple) and a "buffer" is any object that implements the Python buffer protocol -(such as a string or array). The contents that are required to be present in -these buffers are the same as for the C API. - -encode() also takes a list of desired blocknums. Unlike the C API, the Python -API accepts blocknums of primary blocks as well as secondary blocks in its list -of desired blocknums. encode() returns a list of buffer objects which contain -the blocks requested. For each requested block which is a primary block, the -resulting list contains a reference to the apppropriate primary block from the -input list. For each requested block which is a secondary block, the list -contains a newly created string object containing that block. - -decode() also takes a list of integers indicating the blocknums of the blocks -being passed int. decode() returns a list of buffer objects which contain all -of the primary blocks of the original data (in order). For each primary block -which was present in the input list, then the result list simply contains a -reference to the object that was passed in the input list. For each primary -block which was not present in the input, the result list contains a newly -created string object containing that primary block. - -Beware of a "gotcha" that can result from the combination of mutable data and -the fact that the Python API returns references to inputs when possible. - -Returning references to its inputs is efficient since it avoids making an -unnecessary copy of the data, but if the object which was passed as input is -mutable and if that object is mutated after the call to pyfec returns, then the -result from pyfec -- which is just a reference to that same object -- will also -be mutated. This subtlety is the price you pay for avoiding data copying. If -you don't want to have to worry about this then you can simply use immutable -objects (e.g. Python strings) to hold the data that you pass to pyfec. - - - * Utilities - -The filefec.py module has a utility function for efficiently reading a file -and encoding it piece by piece. This module is used by the "fec" and "unfec" -command-line tools from the bin/ directory. - - - * Dependencies - -A C compiler is required. To use the Python API or the command-line tools a -Python interpreter is also required. We have tested it with Python v2.4 and -v2.5. - - - * Acknowledgements - -Thanks to the author of the original fec lib, Luigi Rizzo, and the folks that -contributed to it: Phil Karn, Robert Morelos-Zaragoza, Hari Thirumoorthy, and -Dan Rubenstein. Thanks to the Mnet hackers who wrote an earlier Python -wrapper, especially Myers Carpenter and Hauke Johannknecht. Thanks to Brian -Warner and Amber O'Whielacronx for help with the API, documentation, -debugging, compression, and unit tests. Thanks to the creators of GCC -(starting with Richard M. Stallman) and Valgrind (starting with Julian Seward) -for a pair of excellent tools. Thanks to my coworkers at Allmydata -- -http://allmydata.com -- Fabrice Grinda, Peter Secor, Rob Kinninmont, Brian -Warner, Zandr Milewski, Justin Boreta, Mark Meras for sponsoring this work and -releasing it under a Free Software licence. - - -Enjoy! - -Zooko Wilcox-O'Hearn -2007-04-14 -Boulder, Colorado diff --git a/pyfec/TODO b/pyfec/TODO deleted file mode 100644 index 30f93ff..0000000 --- a/pyfec/TODO +++ /dev/null @@ -1,5 +0,0 @@ - * try Duff's device in _addmul1()? - * make sure it compiles with cygwin gcc -mno-cygwin - * memory usage analysis - * announce on pypi, sf.net, freshmeat, lwn, p2p-hackers - * try compiling with Microsoft compiler diff --git a/pyfec/bin/fec b/pyfec/bin/fec deleted file mode 100755 index 19e7daa..0000000 --- a/pyfec/bin/fec +++ /dev/null @@ -1,55 +0,0 @@ -#!/usr/bin/env python - -# import bindann -# import bindann.monkeypatch.all - -import sys - -import fec -from fec.util import argparse -from fec import filefec -from fec.util.version import Version -__version__ = Version("1.0.0a1-0-STABLE") - -if '-V' in sys.argv or '--version' in sys.argv: - print "pyfec library version: ", fec.__version__ - print "fec command-line tool version: ", __version__ - sys.exit(0) - -parser = argparse.ArgumentParser(description="Encode a file into a set of share files, a subset of which can later be used to recover the original file.") - -parser.add_argument('inputfile', help='file to encode or "-" for stdin', type=argparse.FileType('rb'), metavar='INF') -parser.add_argument('-d', '--output-dir', help='directory in which share file names will be created (default ".")', default='.', metavar='D') -parser.add_argument('-p', '--prefix', help='prefix for share file names; If omitted, the name of the input file will be used.', metavar='P') -parser.add_argument('-s', '--suffix', help='suffix for share file names (default ".fec")', default='.fec', metavar='S') -parser.add_argument('-m', '--totalshares', help='the total number of share files created (default 16)', default=16, type=int, metavar='M') -parser.add_argument('-k', '--requiredshares', help='the number of share files required to reconstruct (default 4)', default=4, type=int, metavar='K') -parser.add_argument('-f', '--force', help='overwrite any file which already in place an output file (share file)', action='store_true') -parser.add_argument('-v', '--verbose', help='print out messages about progress', action='store_true') -parser.add_argument('-V', '--version', help='print out version number and exit', action='store_true') -args = parser.parse_args() - -if args.prefix is None: - args.prefix = args.inputfile.name - if args.prefix == "": - args.prefix = "" - -if args.totalshares < 3: - print "Invalid parameters, totalshares is required to be >= 3\nPlease see the accompanying documentation." - sys.exit(1) -if args.totalshares > 256: - print "Invalid parameters, totalshares is required to be <= 256\nPlease see the accompanying documentation." - sys.exit(1) -if args.requiredshares < 2: - print "Invalid parameters, requiredshares is required to be >= 2\nPlease see the accompanying documentation." - sys.exit(1) -if args.requiredshares >= args.totalshares: - print "Invalid parameters, requiredshares is required to be < totalshares\nPlease see the accompanying documentation." - sys.exit(1) - -args.inputfile.seek(0, 2) -fsize = args.inputfile.tell() -args.inputfile.seek(0, 0) -ret = filefec.encode_to_files(args.inputfile, fsize, args.output_dir, args.prefix, args.requiredshares, args.totalshares, args.suffix, args.force, args.verbose) - -sys.exit(ret) diff --git a/pyfec/bin/unfec b/pyfec/bin/unfec deleted file mode 100755 index 9d9fcad..0000000 --- a/pyfec/bin/unfec +++ /dev/null @@ -1,49 +0,0 @@ -#!/usr/bin/env python - -# import bindann -# import bindann.monkeypatch.all - -import os, sys - -from fec.util import argparse - -import fec -from fec import filefec -from fec.util.version import Version -__version__ = Version("1.0.0a1-0-STABLE") - -if '-V' in sys.argv or '--version' in sys.argv: - print "pyfec library version: ", fec.__version__ - print "fec command-line tool version: ", __version__ - sys.exit(0) - -parser = argparse.ArgumentParser(description="Decode data from share files.") - -parser.add_argument('-o', '--outputfile', required=True, help='file to write the resulting data to, or "-" for stdout', type=str, metavar='OUTF') -parser.add_argument('sharefiles', nargs='*', help='shares file to read the encoded data from', type=argparse.FileType('rb'), metavar='SHAREFILE') -parser.add_argument('-v', '--verbose', help='print out messages about progress', action='store_true') -parser.add_argument('-f', '--force', help='overwrite any file which already in place of the output file', action='store_true') -parser.add_argument('-V', '--version', help='print out version number and exit', action='store_true') -args = parser.parse_args() - -if len(args.sharefiles) < 2: - print "At least two sharefiles are required." - sys.exit(1) - -if args.force: - outf = open(args.outputfile, 'wb') -else: - try: - outfd = os.open(args.outputfile, os.O_WRONLY|os.O_CREAT|os.O_EXCL) - except OSError: - print "There is already a file named %r -- aborting. Use --force to overwrite." % (args.outputfile,) - sys.exit(2) - outf = os.fdopen(outfd, "wb") - -try: - ret = filefec.decode_from_files(outf, args.sharefiles, args.verbose) -except filefec.InsufficientShareFilesError, e: - print str(e) - sys.exit(3) - -sys.exit(ret) diff --git a/pyfec/fec/__init__.py b/pyfec/fec/__init__.py deleted file mode 100644 index a82d775..0000000 --- a/pyfec/fec/__init__.py +++ /dev/null @@ -1,21 +0,0 @@ -""" -pyfec -- fast forward error correction library with Python interface - -maintainer web site: U{http://zooko.com/} - -pyfec web site: U{http://www.allmydata.com/source/pyfec} -""" - -from util.version import Version - -# For an explanation of what the parts of the version string mean, -# please see pyutil.version. -__version__ = Version("1.0.0a1-2-STABLE") - -# Please put a URL or other note here which shows where to get the branch of -# development from which this version grew. -__sources__ = ["http://www.allmydata.com/source/pyfec",] - -from _fec import Encoder, Decoder, Error -import filefec - diff --git a/pyfec/fec/_fecmodule.c b/pyfec/fec/_fecmodule.c deleted file mode 100644 index a4982b1..0000000 --- a/pyfec/fec/_fecmodule.c +++ /dev/null @@ -1,614 +0,0 @@ -/** - * pyfec -- fast forward error correction library with Python interface - * - * Copyright (C) 2007 Allmydata, Inc. - * Author: Zooko Wilcox-O'Hearn - * mailto:zooko@zooko.com - * - * This file is part of pyfec. - * - * This program is free software; you can redistribute it and/or - * modify it under the terms of the GNU General Public License - * as published by the Free Software Foundation; either version 2 - * of the License, or (at your option) any later version. This program also - * comes with the added permission that, in the case that you are obligated to - * release a derived work under this licence (as per section 2.b of the GPL), - * you may delay the fulfillment of this obligation for up to 12 months. - * - * This program is distributed in the hope that it will be useful, - * but WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - * GNU General Public License for more details. - * - * You should have received a copy of the GNU General Public License - * along with this program; if not, write to the Free Software - * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA. - */ - -/** - * based on fecmodule.c by the Mnet Project, especially Myers Carpenter and - * Hauke Johannknecht - */ - -#include -#include - -#if (PY_VERSION_HEX < 0x02050000) -typedef int Py_ssize_t; -#endif - -#include "fec.h" - -#include "stdarg.h" - -static PyObject *py_fec_error; -static PyObject *py_raise_fec_error (const char *format, ...); - -static char fec__doc__[] = "\ -FEC - Forward Error Correction \n\ -"; - -static PyObject * -py_raise_fec_error(const char *format, ...) { - char exceptionMsg[1024]; - va_list ap; - - va_start (ap, format); - vsnprintf (exceptionMsg, 1024, format, ap); - va_end (ap); - exceptionMsg[1023]='\0'; - PyErr_SetString (py_fec_error, exceptionMsg); - return NULL; -} - -static char Encoder__doc__[] = "\ -Hold static encoder state (an in-memory table for matrix multiplication), and k and m parameters, and provide {encode()} method.\n\n\ -@param k: the number of packets required for reconstruction \n\ -@param m: the number of packets generated \n\ -"; - -typedef struct { - PyObject_HEAD - - /* expose these */ - short kk; - short mm; - - /* internal */ - fec_t* fec_matrix; -} Encoder; - -static PyObject * -Encoder_new(PyTypeObject *type, PyObject *args, PyObject *kwds) { - Encoder *self; - - self = (Encoder*)type->tp_alloc(type, 0); - if (self != NULL) { - self->kk = 0; - self->mm = 0; - self->fec_matrix = NULL; - } - - return (PyObject *)self; -} - -static int -Encoder_init(Encoder *self, PyObject *args, PyObject *kwdict) { - static char *kwlist[] = { - "k", - "m", - NULL - }; - int ink, inm; - if (!PyArg_ParseTupleAndKeywords(args, kwdict, "ii", kwlist, &ink, &inm)) - return -1; - - if (ink < 1) { - py_raise_fec_error("Precondition violation: first argument is required to be greater than or equal to 1, but it was %d", self->kk); - return -1; - } - if (inm < 1) { - py_raise_fec_error("Precondition violation: second argument is required to be greater than or equal to 1, but it was %d", self->mm); - return -1; - } - if (inm > 256) { - py_raise_fec_error("Precondition violation: second argument is required to be less than or equal to 256, but it was %d", self->mm); - return -1; - } - if (ink > inm) { - py_raise_fec_error("Precondition violation: first argument is required to be less than or equal to the second argument, but they were %d and %d respectively", ink, inm); - return -1; - } - self->kk = (short)ink; - self->mm = (short)inm; - self->fec_matrix = fec_new(self->kk, self->mm); - - return 0; -} - -static char Encoder_encode__doc__[] = "\ -Encode data into m packets.\n\ -\n\ -@param inblocks: a sequence of k buffers of data to encode -- these are the k primary blocks, i.e. the input data split into k pieces (for best performance, make it a tuple instead of a list); All blocks are required to be the same length.\n\ -@param desired_blocks_nums optional sequence of blocknums indicating which blocks to produce and return; If None, all m blocks will be returned (in order). (For best performance, make it a tuple instead of a list.)\n\ -@returns: a list of buffers containing the requested blocks; Note that if any of the input blocks were 'primary blocks', i.e. their blocknum was < k, then the result sequence will contain a Python reference to the same Python object as was passed in. As long as the Python object in question is immutable (i.e. a string) then you don't have to think about this detail, but if it is mutable (i.e. an array), then you have to be aware that if you subsequently mutate the contents of that object then that will also change the contents of the sequence that was returned from this call to encode().\n\ -"; - -static PyObject * -Encoder_encode(Encoder *self, PyObject *args) { - PyObject* inblocks; - PyObject* desired_blocks_nums = NULL; /* The blocknums of the blocks that should be returned. */ - PyObject* result = NULL; - - if (!PyArg_ParseTuple(args, "O|O", &inblocks, &desired_blocks_nums)) - return NULL; - - gf* check_blocks_produced[self->mm - self->kk]; /* This is an upper bound -- we will actually use only num_check_blocks_produced of these elements (see below). */ - PyObject* pystrs_produced[self->mm - self->kk]; /* This is an upper bound -- we will actually use only num_check_blocks_produced of these elements (see below). */ - unsigned num_check_blocks_produced = 0; /* The first num_check_blocks_produced elements of the check_blocks_produced array and of the pystrs_produced array will be used. */ - const gf* incblocks[self->kk]; - unsigned num_desired_blocks; - PyObject* fast_desired_blocks_nums = NULL; - PyObject** fast_desired_blocks_nums_items; - unsigned c_desired_blocks_nums[self->mm]; - unsigned c_desired_checkblocks_ids[self->mm - self->kk]; - unsigned i; - PyObject* fastinblocks = NULL; - - for (i=0; imm - self->kk; i++) - pystrs_produced[i] = NULL; - if (desired_blocks_nums) { - fast_desired_blocks_nums = PySequence_Fast(desired_blocks_nums, "Second argument (optional) was not a sequence."); - if (!fast_desired_blocks_nums) - goto err; - num_desired_blocks = PySequence_Fast_GET_SIZE(fast_desired_blocks_nums); - fast_desired_blocks_nums_items = PySequence_Fast_ITEMS(fast_desired_blocks_nums); - for (i=0; i= self->kk) - num_check_blocks_produced++; - } - } else { - num_desired_blocks = self->mm; - for (i=0; imm - self->kk; - } - - fastinblocks = PySequence_Fast(inblocks, "First argument was not a sequence."); - if (!fastinblocks) - goto err; - - if (PySequence_Fast_GET_SIZE(fastinblocks) != self->kk) { - py_raise_fec_error("Precondition violation: Wrong length -- first argument is required to contain exactly k blocks. len(first): %d, k: %d", PySequence_Fast_GET_SIZE(fastinblocks), self->kk); - goto err; - } - - /* Construct a C array of gf*'s of the input data. */ - PyObject** fastinblocksitems = PySequence_Fast_ITEMS(fastinblocks); - if (!fastinblocksitems) - goto err; - Py_ssize_t sz, oldsz = 0; - for (i=0; ikk; i++) { - if (!PyObject_CheckReadBuffer(fastinblocksitems[i])) { - py_raise_fec_error("Precondition violation: %u'th item is required to offer the single-segment read character buffer protocol, but it does not.\n", i); - goto err; - } - if (PyObject_AsReadBuffer(fastinblocksitems[i], (const void**)&(incblocks[i]), &sz)) - goto err; - if (oldsz != 0 && oldsz != sz) { - py_raise_fec_error("Precondition violation: Input blocks are required to be all the same length. oldsz: %Zu, sz: %Zu\n", oldsz, sz); - goto err; - } - oldsz = sz; - } - - /* Allocate space for all of the check blocks. */ - unsigned char check_block_index = 0; /* index into the check_blocks_produced and (parallel) pystrs_produced arrays */ - for (i=0; i= self->kk) { - c_desired_checkblocks_ids[check_block_index] = c_desired_blocks_nums[i]; - pystrs_produced[check_block_index] = PyString_FromStringAndSize(NULL, sz); - if (pystrs_produced[check_block_index] == NULL) - goto err; - check_blocks_produced[check_block_index] = (gf*)PyString_AsString(pystrs_produced[check_block_index]); - if (check_blocks_produced[check_block_index] == NULL) - goto err; - check_block_index++; - } - } - assert (check_block_index == num_check_blocks_produced); - - /* Encode any check blocks that are needed. */ - fec_encode(self->fec_matrix, incblocks, check_blocks_produced, c_desired_checkblocks_ids, num_check_blocks_produced, sz); - - /* Wrap all requested blocks up into a Python list of Python strings. */ - result = PyList_New(num_desired_blocks); - if (result == NULL) - goto err; - check_block_index = 0; - for (i=0; ikk) { - Py_INCREF(fastinblocksitems[c_desired_blocks_nums[i]]); - if (PyList_SetItem(result, i, fastinblocksitems[c_desired_blocks_nums[i]]) == -1) { - Py_DECREF(fastinblocksitems[c_desired_blocks_nums[i]]); - goto err; - } - } else { - if (PyList_SetItem(result, i, pystrs_produced[check_block_index]) == -1) - goto err; - pystrs_produced[check_block_index] = NULL; - check_block_index++; - } - } - - goto cleanup; - err: - for (i=0; ifec_matrix); - self->ob_type->tp_free((PyObject*)self); -} - -static PyMethodDef Encoder_methods[] = { - {"encode", (PyCFunction)Encoder_encode, METH_VARARGS, Encoder_encode__doc__}, - {NULL}, -}; - -static PyMemberDef Encoder_members[] = { - {"k", T_SHORT, offsetof(Encoder, kk), READONLY, "k"}, - {"m", T_SHORT, offsetof(Encoder, mm), READONLY, "m"}, - {NULL} /* Sentinel */ -}; - -static PyTypeObject Encoder_type = { - PyObject_HEAD_INIT(NULL) - 0, /*ob_size*/ - "fec.Encoder", /*tp_name*/ - sizeof(Encoder), /*tp_basicsize*/ - 0, /*tp_itemsize*/ - (destructor)Encoder_dealloc, /*tp_dealloc*/ - 0, /*tp_print*/ - 0, /*tp_getattr*/ - 0, /*tp_setattr*/ - 0, /*tp_compare*/ - 0, /*tp_repr*/ - 0, /*tp_as_number*/ - 0, /*tp_as_sequence*/ - 0, /*tp_as_mapping*/ - 0, /*tp_hash */ - 0, /*tp_call*/ - 0, /*tp_str*/ - 0, /*tp_getattro*/ - 0, /*tp_setattro*/ - 0, /*tp_as_buffer*/ - Py_TPFLAGS_DEFAULT | Py_TPFLAGS_BASETYPE, /*tp_flags*/ - Encoder__doc__, /* tp_doc */ - 0, /* tp_traverse */ - 0, /* tp_clear */ - 0, /* tp_richcompare */ - 0, /* tp_weaklistoffset */ - 0, /* tp_iter */ - 0, /* tp_iternext */ - Encoder_methods, /* tp_methods */ - Encoder_members, /* tp_members */ - 0, /* tp_getset */ - 0, /* tp_base */ - 0, /* tp_dict */ - 0, /* tp_descr_get */ - 0, /* tp_descr_set */ - 0, /* tp_dictoffset */ - (initproc)Encoder_init, /* tp_init */ - 0, /* tp_alloc */ - Encoder_new, /* tp_new */ -}; - -static char Decoder__doc__[] = "\ -Hold static decoder state (an in-memory table for matrix multiplication), and k and m parameters, and provide {decode()} method.\n\n\ -@param k: the number of packets required for reconstruction \n\ -@param m: the number of packets generated \n\ -"; - -typedef struct { - PyObject_HEAD - - /* expose these */ - short kk; - short mm; - - /* internal */ - fec_t* fec_matrix; -} Decoder; - -static PyObject * -Decoder_new(PyTypeObject *type, PyObject *args, PyObject *kwds) { - Decoder *self; - - self = (Decoder*)type->tp_alloc(type, 0); - if (self != NULL) { - self->kk = 0; - self->mm = 0; - self->fec_matrix = NULL; - } - - return (PyObject *)self; -} - -static int -Decoder_init(Encoder *self, PyObject *args, PyObject *kwdict) { - static char *kwlist[] = { - "k", - "m", - NULL - }; - - int ink, inm; - if (!PyArg_ParseTupleAndKeywords(args, kwdict, "ii", kwlist, &ink, &inm)) - return -1; - - if (ink < 1) { - py_raise_fec_error("Precondition violation: first argument is required to be greater than or equal to 1, but it was %d", self->kk); - return -1; - } - if (inm < 1) { - py_raise_fec_error("Precondition violation: second argument is required to be greater than or equal to 1, but it was %d", self->mm); - return -1; - } - if (inm > 256) { - py_raise_fec_error("Precondition violation: second argument is required to be less than or equal to 256, but it was %d", self->mm); - return -1; - } - if (ink > inm) { - py_raise_fec_error("Precondition violation: first argument is required to be less than or equal to the second argument, but they were %d and %d respectively", ink, inm); - return -1; - } - self->kk = (short)ink; - self->mm = (short)inm; - self->fec_matrix = fec_new(self->kk, self->mm); - - return 0; -} - -#define SWAP(a,b,t) {t tmp; tmp=a; a=b; b=tmp;} - -static char Decoder_decode__doc__[] = "\ -Decode a list blocks into a list of segments.\n\ -@param blocks a sequence of buffers containing block data (for best performance, make it a tuple instead of a list)\n\ -@param blocknums a sequence of integers of the blocknum for each block in blocks (for best performance, make it a tuple instead of a list)\n\ -\n\ -@return a list of strings containing the segment data (i.e. ''.join(retval) yields a string containing the decoded data)\n\ -"; - -static PyObject * -Decoder_decode(Decoder *self, PyObject *args) { - PyObject*restrict blocks; - PyObject*restrict blocknums; - PyObject* result = NULL; - - if (!PyArg_ParseTuple(args, "OO", &blocks, &blocknums)) - return NULL; - - const gf*restrict cblocks[self->kk]; - unsigned cblocknums[self->kk]; - gf*restrict recoveredcstrs[self->kk]; /* self->kk is actually an upper bound -- we probably won't need all of this space. */ - PyObject*restrict recoveredpystrs[self->kk]; /* self->kk is actually an upper bound -- we probably won't need all of this space. */ - unsigned i; - for (i=0; ikk; i++) - recoveredpystrs[i] = NULL; - PyObject*restrict fastblocknums = NULL; - PyObject*restrict fastblocks = PySequence_Fast(blocks, "First argument was not a sequence."); - if (!fastblocks) - goto err; - fastblocknums = PySequence_Fast(blocknums, "Second argument was not a sequence."); - if (!fastblocknums) - goto err; - - if (PySequence_Fast_GET_SIZE(fastblocks) != self->kk) { - py_raise_fec_error("Precondition violation: Wrong length -- first argument is required to contain exactly k blocks. len(first): %d, k: %d", PySequence_Fast_GET_SIZE(fastblocks), self->kk); - goto err; - } - if (PySequence_Fast_GET_SIZE(fastblocknums) != self->kk) { - py_raise_fec_error("Precondition violation: Wrong length -- blocknums is required to contain exactly k blocks. len(blocknums): %d, k: %d", PySequence_Fast_GET_SIZE(fastblocknums), self->kk); - goto err; - } - - /* Construct a C array of gf*'s of the data and another of C ints of the blocknums. */ - unsigned needtorecover=0; - PyObject** fastblocknumsitems = PySequence_Fast_ITEMS(fastblocknums); - if (!fastblocknumsitems) - goto err; - PyObject** fastblocksitems = PySequence_Fast_ITEMS(fastblocks); - if (!fastblocksitems) - goto err; - Py_ssize_t sz, oldsz = 0; - for (i=0; ikk; i++) { - if (!PyInt_Check(fastblocknumsitems[i])) { - py_raise_fec_error("Precondition violation: second argument is required to contain int."); - goto err; - } - long tmpl = PyInt_AsLong(fastblocknumsitems[i]); - if (tmpl < 0 || tmpl > 255) { - py_raise_fec_error("Precondition violation: block nums can't be less than zero or greater than 255. %ld\n", tmpl); - goto err; - } - cblocknums[i] = (unsigned)tmpl; - if (cblocknums[i] >= self->kk) - needtorecover+=1; - - if (!PyObject_CheckReadBuffer(fastblocksitems[i])) { - py_raise_fec_error("Precondition violation: %u'th item is required to offer the single-segment read character buffer protocol, but it does not.\n", i); - goto err; - } - if (PyObject_AsReadBuffer(fastblocksitems[i], (const void**)&(cblocks[i]), &sz)) - goto err; - if (oldsz != 0 && oldsz != sz) { - py_raise_fec_error("Precondition violation: Input blocks are required to be all the same length. oldsz: %Zu, sz: %Zu\n", oldsz, sz); - goto err; - } - oldsz = sz; - } - - /* move src packets into position */ - for (i=0; ikk;) { - if (cblocknums[i] >= self->kk || cblocknums[i] == i) - i++; - else { - /* put pkt in the right position. */ - unsigned c = cblocknums[i]; - - SWAP (cblocknums[i], cblocknums[c], int); - SWAP (cblocks[i], cblocks[c], const gf*); - SWAP (fastblocksitems[i], fastblocksitems[c], PyObject*); - } - } - - /* Allocate space for all of the recovered blocks. */ - for (i=0; ifec_matrix, cblocks, recoveredcstrs, cblocknums, sz); - - /* Wrap up both original primary blocks and decoded blocks into a Python list of Python strings. */ - unsigned nextrecoveredix=0; - result = PyList_New(self->kk); - if (result == NULL) - goto err; - for (i=0; ikk; i++) { - if (cblocknums[i] == i) { - /* Original primary block. */ - Py_INCREF(fastblocksitems[i]); - if (PyList_SetItem(result, i, fastblocksitems[i]) == -1) { - Py_DECREF(fastblocksitems[i]); - goto err; - } - } else { - /* Recovered block. */ - if (PyList_SetItem(result, i, recoveredpystrs[nextrecoveredix]) == -1) - goto err; - recoveredpystrs[nextrecoveredix] = NULL; - nextrecoveredix++; - } - } - - goto cleanup; - err: - for (i=0; ikk; i++) - Py_XDECREF(recoveredpystrs[i]); - Py_XDECREF(result); result = NULL; - cleanup: - Py_XDECREF(fastblocks); fastblocks=NULL; - Py_XDECREF(fastblocknums); fastblocknums=NULL; - return result; -} - -static void -Decoder_dealloc(Decoder * self) { - fec_free(self->fec_matrix); - self->ob_type->tp_free((PyObject*)self); -} - -static PyMethodDef Decoder_methods[] = { - {"decode", (PyCFunction)Decoder_decode, METH_VARARGS, Decoder_decode__doc__}, - {NULL}, -}; - -static PyMemberDef Decoder_members[] = { - {"k", T_SHORT, offsetof(Encoder, kk), READONLY, "k"}, - {"m", T_SHORT, offsetof(Encoder, mm), READONLY, "m"}, - {NULL} /* Sentinel */ -}; - -static PyTypeObject Decoder_type = { - PyObject_HEAD_INIT(NULL) - 0, /*ob_size*/ - "fec.Decoder", /*tp_name*/ - sizeof(Decoder), /*tp_basicsize*/ - 0, /*tp_itemsize*/ - (destructor)Decoder_dealloc, /*tp_dealloc*/ - 0, /*tp_print*/ - 0, /*tp_getattr*/ - 0, /*tp_setattr*/ - 0, /*tp_compare*/ - 0, /*tp_repr*/ - 0, /*tp_as_number*/ - 0, /*tp_as_sequence*/ - 0, /*tp_as_mapping*/ - 0, /*tp_hash */ - 0, /*tp_call*/ - 0, /*tp_str*/ - 0, /*tp_getattro*/ - 0, /*tp_setattro*/ - 0, /*tp_as_buffer*/ - Py_TPFLAGS_DEFAULT | Py_TPFLAGS_BASETYPE, /*tp_flags*/ - Decoder__doc__, /* tp_doc */ - 0, /* tp_traverse */ - 0, /* tp_clear */ - 0, /* tp_richcompare */ - 0, /* tp_weaklistoffset */ - 0, /* tp_iter */ - 0, /* tp_iternext */ - Decoder_methods, /* tp_methods */ - Decoder_members, /* tp_members */ - 0, /* tp_getset */ - 0, /* tp_base */ - 0, /* tp_dict */ - 0, /* tp_descr_get */ - 0, /* tp_descr_set */ - 0, /* tp_dictoffset */ - (initproc)Decoder_init, /* tp_init */ - 0, /* tp_alloc */ - Decoder_new, /* tp_new */ -}; - -static PyMethodDef fec_methods[] = { - {NULL} -}; - -#ifndef PyMODINIT_FUNC /* declarations for DLL import/export */ -#define PyMODINIT_FUNC void -#endif -PyMODINIT_FUNC -init_fec(void) { - PyObject *module; - PyObject *module_dict; - - if (PyType_Ready(&Encoder_type) < 0) - return; - if (PyType_Ready(&Decoder_type) < 0) - return; - - module = Py_InitModule3("_fec", fec_methods, fec__doc__); - if (module == NULL) - return; - - Py_INCREF(&Encoder_type); - Py_INCREF(&Decoder_type); - - PyModule_AddObject(module, "Encoder", (PyObject *)&Encoder_type); - PyModule_AddObject(module, "Decoder", (PyObject *)&Decoder_type); - - module_dict = PyModule_GetDict(module); - py_fec_error = PyErr_NewException("_fec.Error", NULL, NULL); - PyDict_SetItemString(module_dict, "Error", py_fec_error); -} - diff --git a/pyfec/fec/easyfec.py b/pyfec/fec/easyfec.py deleted file mode 100644 index 8fc1506..0000000 --- a/pyfec/fec/easyfec.py +++ /dev/null @@ -1,37 +0,0 @@ -import fec - -# div_ceil() was copied from the pyutil library. -def div_ceil(n, d): - """ - The smallest integer k such that k*d >= n. - """ - return (n/d) + (n%d != 0) - -class Encoder(object): - def __init__(self, k, m): - self.fec = fec.Encoder(k, m) - - def encode(self, data): - """ - @param data: string - """ - chunksize = div_ceil(len(data), self.fec.k) - numchunks = div_ceil(len(data), chunksize) - l = [ data[i:i+chunksize] for i in range(0, len(data), chunksize) ] - # padding - if len(l[-1]) != len(l[0]): - l[-1] = l[-1] + ('\x00'*(len(l[0])-len(l[-1]))) - res = self.fec.encode(l) - return res - -class Decoder(object): - def __init__(self, k, m): - self.fec = fec.Decoder(k, m) - - def decode(self, blocks, sharenums, padlen=0): - blocks = self.fec.decode(blocks, sharenums) - data = ''.join(blocks) - if padlen: - data = data[:-padlen] - return data - diff --git a/pyfec/fec/fec.c b/pyfec/fec/fec.c deleted file mode 100644 index dce7706..0000000 --- a/pyfec/fec/fec.c +++ /dev/null @@ -1,616 +0,0 @@ -/** - * pyfec -- fast forward error correction library with Python interface - * - * Copyright (C) 2007 Allmydata, Inc. - * Author: Zooko Wilcox-O'Hearn - * mailto:zooko@zooko.com - * - * This file is part of pyfec. - * - * This program is free software; you can redistribute it and/or modify it - * under the terms of the GNU General Public License as published by the Free - * Software Foundation; either version 2 of the License, or (at your option) - * any later version. This program also comes with the added permission that, - * in the case that you are obligated to release a derived work under this - * licence (as per section 2.b of the GPL), you may delay the fulfillment of - * this obligation for up to 12 months. - * - * This program is distributed in the hope that it will be useful, - * but WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - * GNU General Public License for more details. - * - * You should have received a copy of the GNU General Public License - * along with this program; if not, write to the Free Software - * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA. - */ - -/* - * Much of this work is derived from the "fec" software by Luigi Rizzo, et - * al., the copyright notice and licence terms of which are included below - * for reference. - * fec.c -- forward error correction based on Vandermonde matrices - * 980624 - * (C) 1997-98 Luigi Rizzo (luigi@iet.unipi.it) - * - * Portions derived from code by Phil Karn (karn@ka9q.ampr.org), - * Robert Morelos-Zaragoza (robert@spectra.eng.hawaii.edu) and Hari - * Thirumoorthy (harit@spectra.eng.hawaii.edu), Aug 1995 - * - * Modifications by Dan Rubenstein (see Modifications.txt for - * their description. - * Modifications (C) 1998 Dan Rubenstein (drubenst@cs.umass.edu) - * - * Redistribution and use in source and binary forms, with or without - * modification, are permitted provided that the following conditions - * are met: - * - * 1. Redistributions of source code must retain the above copyright - * notice, this list of conditions and the following disclaimer. - * 2. Redistributions in binary form must reproduce the above - * copyright notice, this list of conditions and the following - * disclaimer in the documentation and/or other materials - * provided with the distribution. - * - * THIS SOFTWARE IS PROVIDED BY THE AUTHORS ``AS IS'' AND - * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, - * THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A - * PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHORS - * BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, - * OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, - * PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, - * OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY - * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR - * TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT - * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY - * OF SUCH DAMAGE. - */ - -#include -#include -#include -#include - -#include "fec.h" - - -/* - * If you get a error returned (negative value) from a fec_* function, - * look in here for the error message. - */ - -#define FEC_ERROR_SIZE 1025 -char fec_error[FEC_ERROR_SIZE+1]; - -#define ERR(...) (snprintf(fec_error, FEC_ERROR_SIZE, __VA_ARGS__)) - -/* - * Primitive polynomials - see Lin & Costello, Appendix A, - * and Lee & Messerschmitt, p. 453. - */ -static const char*const Pp="101110001"; - - -/* - * To speed up computations, we have tables for logarithm, exponent and - * inverse of a number. We use a table for multiplication as well (it takes - * 64K, no big deal even on a PDA, especially because it can be - * pre-initialized an put into a ROM!), otherwhise we use a table of - * logarithms. In any case the macro gf_mul(x,y) takes care of - * multiplications. - */ - -static gf gf_exp[510]; /* index->poly form conversion table */ -static int gf_log[256]; /* Poly->index form conversion table */ -static gf inverse[256]; /* inverse of field elem. */ - /* inv[\alpha**i]=\alpha**(GF_SIZE-i-1) */ - -/* - * modnn(x) computes x % GF_SIZE, where GF_SIZE is 2**GF_BITS - 1, - * without a slow divide. - */ -static inline gf -modnn(int x) { - while (x >= 255) { - x -= 255; - x = (x >> 8) + (x & 255); - } - return x; -} - -#define SWAP(a,b,t) {t tmp; tmp=a; a=b; b=tmp;} - -/* - * gf_mul(x,y) multiplies two numbers. It is much faster to use a - * multiplication table. - * - * USE_GF_MULC, GF_MULC0(c) and GF_ADDMULC(x) can be used when multiplying - * many numbers by the same constant. In this case the first call sets the - * constant, and others perform the multiplications. A value related to the - * multiplication is held in a local variable declared with USE_GF_MULC . See - * usage in _addmul1(). - */ -static gf gf_mul_table[256][256]; - -#define gf_mul(x,y) gf_mul_table[x][y] - -#define USE_GF_MULC register gf * __gf_mulc_ -#define GF_MULC0(c) __gf_mulc_ = gf_mul_table[c] -#define GF_ADDMULC(dst, x) dst ^= __gf_mulc_[x] - -/* - * Generate GF(2**m) from the irreducible polynomial p(X) in p[0]..p[m] - * Lookup tables: - * index->polynomial form gf_exp[] contains j= \alpha^i; - * polynomial form -> index form gf_log[ j = \alpha^i ] = i - * \alpha=x is the primitive element of GF(2^m) - * - * For efficiency, gf_exp[] has size 2*GF_SIZE, so that a simple - * multiplication of two numbers can be resolved without calling modnn - */ -static void -_init_mul_table(void) { - int i, j; - for (i = 0; i < 256; i++) - for (j = 0; j < 256; j++) - gf_mul_table[i][j] = gf_exp[modnn (gf_log[i] + gf_log[j])]; - - for (j = 0; j < 256; j++) - gf_mul_table[0][j] = gf_mul_table[j][0] = 0; -} - -/* - * i use malloc so many times, it is easier to put checks all in - * one place. - */ -static void * -my_malloc (int sz, char *err_string) { - void *p = malloc (sz); - if (p == NULL) { - ERR("Malloc failure allocating %s\n", err_string); - exit (1); - } - return p; -} - -#define NEW_GF_MATRIX(rows, cols) \ - (gf*)my_malloc(rows * cols, " ## __LINE__ ## " ) - -/* - * initialize the data structures used for computations in GF. - */ -static void -generate_gf (void) { - int i; - gf mask; - - mask = 1; /* x ** 0 = 1 */ - gf_exp[8] = 0; /* will be updated at the end of the 1st loop */ - /* - * first, generate the (polynomial representation of) powers of \alpha, - * which are stored in gf_exp[i] = \alpha ** i . - * At the same time build gf_log[gf_exp[i]] = i . - * The first 8 powers are simply bits shifted to the left. - */ - for (i = 0; i < 8; i++, mask <<= 1) { - gf_exp[i] = mask; - gf_log[gf_exp[i]] = i; - /* - * If Pp[i] == 1 then \alpha ** i occurs in poly-repr - * gf_exp[8] = \alpha ** 8 - */ - if (Pp[i] == '1') - gf_exp[8] ^= mask; - } - /* - * now gf_exp[8] = \alpha ** 8 is complete, so can also - * compute its inverse. - */ - gf_log[gf_exp[8]] = 8; - /* - * Poly-repr of \alpha ** (i+1) is given by poly-repr of - * \alpha ** i shifted left one-bit and accounting for any - * \alpha ** 8 term that may occur when poly-repr of - * \alpha ** i is shifted. - */ - mask = 1 << 7; - for (i = 9; i < 255; i++) { - if (gf_exp[i - 1] >= mask) - gf_exp[i] = gf_exp[8] ^ ((gf_exp[i - 1] ^ mask) << 1); - else - gf_exp[i] = gf_exp[i - 1] << 1; - gf_log[gf_exp[i]] = i; - } - /* - * log(0) is not defined, so use a special value - */ - gf_log[0] = 255; - /* set the extended gf_exp values for fast multiply */ - for (i = 0; i < 255; i++) - gf_exp[i + 255] = gf_exp[i]; - - /* - * again special cases. 0 has no inverse. This used to - * be initialized to 255, but it should make no difference - * since noone is supposed to read from here. - */ - inverse[0] = 0; - inverse[1] = 1; - for (i = 2; i <= 255; i++) - inverse[i] = gf_exp[255 - gf_log[i]]; -} - -/* - * Various linear algebra operations that i use often. - */ - -/* - * addmul() computes dst[] = dst[] + c * src[] - * This is used often, so better optimize it! Currently the loop is - * unrolled 16 times, a good value for 486 and pentium-class machines. - * The case c=0 is also optimized, whereas c=1 is not. These - * calls are unfrequent in my typical apps so I did not bother. - */ -#define addmul(dst, src, c, sz) \ - if (c != 0) _addmul1(dst, src, c, sz) - -#define UNROLL 16 /* 1, 4, 8, 16 */ -static void -_addmul1(register gf*restrict dst, const register gf*restrict src, gf c, size_t sz) { - USE_GF_MULC; - const gf* lim = &dst[sz - UNROLL + 1]; - - GF_MULC0 (c); - -#if (UNROLL > 1) /* unrolling by 8/16 is quite effective on the pentium */ - for (; dst < lim; dst += UNROLL, src += UNROLL) { - GF_ADDMULC (dst[0], src[0]); - GF_ADDMULC (dst[1], src[1]); - GF_ADDMULC (dst[2], src[2]); - GF_ADDMULC (dst[3], src[3]); -#if (UNROLL > 4) - GF_ADDMULC (dst[4], src[4]); - GF_ADDMULC (dst[5], src[5]); - GF_ADDMULC (dst[6], src[6]); - GF_ADDMULC (dst[7], src[7]); -#endif -#if (UNROLL > 8) - GF_ADDMULC (dst[8], src[8]); - GF_ADDMULC (dst[9], src[9]); - GF_ADDMULC (dst[10], src[10]); - GF_ADDMULC (dst[11], src[11]); - GF_ADDMULC (dst[12], src[12]); - GF_ADDMULC (dst[13], src[13]); - GF_ADDMULC (dst[14], src[14]); - GF_ADDMULC (dst[15], src[15]); -#endif - } -#endif - lim += UNROLL - 1; - for (; dst < lim; dst++, src++) /* final components */ - GF_ADDMULC (*dst, *src); -} - -/* - * computes C = AB where A is n*k, B is k*m, C is n*m - */ -static void -_matmul(gf * a, gf * b, gf * c, unsigned n, unsigned k, unsigned m) { - unsigned row, col, i; - - for (row = 0; row < n; row++) { - for (col = 0; col < m; col++) { - gf *pa = &a[row * k]; - gf *pb = &b[col]; - gf acc = 0; - for (i = 0; i < k; i++, pa++, pb += m) - acc ^= gf_mul (*pa, *pb); - c[row * m + col] = acc; - } - } -} - -/* - * _invert_mat() takes a matrix and produces its inverse - * k is the size of the matrix. - * (Gauss-Jordan, adapted from Numerical Recipes in C) - * Return non-zero if singular. - */ -static void -_invert_mat(gf* src, unsigned k) { - gf c, *p; - unsigned irow = 0; - unsigned icol = 0; - unsigned row, col, i, ix; - - unsigned* indxc = (unsigned*) my_malloc (k * sizeof(unsigned), "indxc"); - unsigned* indxr = (unsigned*) my_malloc (k * sizeof(unsigned), "indxr"); - unsigned* ipiv = (unsigned*) my_malloc (k * sizeof(unsigned), "ipiv"); - gf *id_row = NEW_GF_MATRIX (1, k); - gf *temp_row = NEW_GF_MATRIX (1, k); - - memset (id_row, '\0', k * sizeof (gf)); - /* - * ipiv marks elements already used as pivots. - */ - for (i = 0; i < k; i++) - ipiv[i] = 0; - - for (col = 0; col < k; col++) { - gf *pivot_row; - /* - * Zeroing column 'col', look for a non-zero element. - * First try on the diagonal, if it fails, look elsewhere. - */ - if (ipiv[col] != 1 && src[col * k + col] != 0) { - irow = col; - icol = col; - goto found_piv; - } - for (row = 0; row < k; row++) { - if (ipiv[row] != 1) { - for (ix = 0; ix < k; ix++) { - if (ipiv[ix] == 0) { - if (src[row * k + ix] != 0) { - irow = row; - icol = ix; - goto found_piv; - } - } else if (ipiv[ix] > 1) { - ERR("singular matrix"); - goto fail; - } - } - } - } - found_piv: - ++(ipiv[icol]); - /* - * swap rows irow and icol, so afterwards the diagonal - * element will be correct. Rarely done, not worth - * optimizing. - */ - if (irow != icol) - for (ix = 0; ix < k; ix++) - SWAP (src[irow * k + ix], src[icol * k + ix], gf); - indxr[col] = irow; - indxc[col] = icol; - pivot_row = &src[icol * k]; - c = pivot_row[icol]; - if (c == 0) { - ERR("singular matrix 2"); - goto fail; - } - if (c != 1) { /* otherwhise this is a NOP */ - /* - * this is done often , but optimizing is not so - * fruitful, at least in the obvious ways (unrolling) - */ - c = inverse[c]; - pivot_row[icol] = 1; - for (ix = 0; ix < k; ix++) - pivot_row[ix] = gf_mul (c, pivot_row[ix]); - } - /* - * from all rows, remove multiples of the selected row - * to zero the relevant entry (in fact, the entry is not zero - * because we know it must be zero). - * (Here, if we know that the pivot_row is the identity, - * we can optimize the addmul). - */ - id_row[icol] = 1; - if (memcmp (pivot_row, id_row, k * sizeof (gf)) != 0) { - for (p = src, ix = 0; ix < k; ix++, p += k) { - if (ix != icol) { - c = p[icol]; - p[icol] = 0; - addmul (p, pivot_row, c, k); - } - } - } - id_row[icol] = 0; - } /* done all columns */ - for (col = k; col > 0; col--) - if (indxr[col-1] != indxc[col-1]) - for (row = 0; row < k; row++) - SWAP (src[row * k + indxr[col-1]], src[row * k + indxc[col-1]], gf); - fail: - free (indxc); - free (indxr); - free (ipiv); - free (id_row); - free (temp_row); - return; -} - -/* - * fast code for inverting a vandermonde matrix. - * - * NOTE: It assumes that the matrix is not singular and _IS_ a vandermonde - * matrix. Only uses the second column of the matrix, containing the p_i's. - * - * Algorithm borrowed from "Numerical recipes in C" -- sec.2.8, but largely - * revised for my purposes. - * p = coefficients of the matrix (p_i) - * q = values of the polynomial (known) - */ -void -_invert_vdm (gf* src, unsigned k) { - unsigned i, j, row, col; - gf *b, *c, *p; - gf t, xx; - - if (k == 1) /* degenerate case, matrix must be p^0 = 1 */ - return; - /* - * c holds the coefficient of P(x) = Prod (x - p_i), i=0..k-1 - * b holds the coefficient for the matrix inversion - */ - c = NEW_GF_MATRIX (1, k); - b = NEW_GF_MATRIX (1, k); - - p = NEW_GF_MATRIX (1, k); - - for (j = 1, i = 0; i < k; i++, j += k) { - c[i] = 0; - p[i] = src[j]; /* p[i] */ - } - /* - * construct coeffs. recursively. We know c[k] = 1 (implicit) - * and start P_0 = x - p_0, then at each stage multiply by - * x - p_i generating P_i = x P_{i-1} - p_i P_{i-1} - * After k steps we are done. - */ - c[k - 1] = p[0]; /* really -p(0), but x = -x in GF(2^m) */ - for (i = 1; i < k; i++) { - gf p_i = p[i]; /* see above comment */ - for (j = k - 1 - (i - 1); j < k - 1; j++) - c[j] ^= gf_mul (p_i, c[j + 1]); - c[k - 1] ^= p_i; - } - - for (row = 0; row < k; row++) { - /* - * synthetic division etc. - */ - xx = p[row]; - t = 1; - b[k - 1] = 1; /* this is in fact c[k] */ - for (i = k - 1; i > 0; i--) { - b[i-1] = c[i] ^ gf_mul (xx, b[i]); - t = gf_mul (xx, t) ^ b[i-1]; - } - for (col = 0; col < k; col++) - src[col * k + row] = gf_mul (inverse[t], b[col]); - } - free (c); - free (b); - free (p); - return; -} - -static int fec_initialized = 0; -static void -init_fec (void) { - generate_gf(); - _init_mul_table(); - fec_initialized = 1; -} - -/* - * This section contains the proper FEC encoding/decoding routines. - * The encoding matrix is computed starting with a Vandermonde matrix, - * and then transforming it into a systematic matrix. - */ - -#define FEC_MAGIC 0xFECC0DEC - -void -fec_free (fec_t *p) { - if (p == NULL || - p->magic != (((FEC_MAGIC ^ p->k) ^ p->n) ^ (unsigned long) (p->enc_matrix))) { - ERR("bad parameters to fec_free"); - return; - } - free (p->enc_matrix); - free (p); -} - -fec_t * -fec_new(unsigned k, unsigned n) { - unsigned row, col; - gf *p, *tmp_m; - - fec_t *retval; - - fec_error[FEC_ERROR_SIZE] = '\0'; - - if (fec_initialized == 0) - init_fec (); - - retval = (fec_t *) my_malloc (sizeof (fec_t), "new_code"); - retval->k = k; - retval->n = n; - retval->enc_matrix = NEW_GF_MATRIX (n, k); - retval->magic = ((FEC_MAGIC ^ k) ^ n) ^ (unsigned long) (retval->enc_matrix); - tmp_m = NEW_GF_MATRIX (n, k); - /* - * fill the matrix with powers of field elements, starting from 0. - * The first row is special, cannot be computed with exp. table. - */ - tmp_m[0] = 1; - for (col = 1; col < k; col++) - tmp_m[col] = 0; - for (p = tmp_m + k, row = 0; row < n - 1; row++, p += k) - for (col = 0; col < k; col++) - p[col] = gf_exp[modnn (row * col)]; - - /* - * quick code to build systematic matrix: invert the top - * k*k vandermonde matrix, multiply right the bottom n-k rows - * by the inverse, and construct the identity matrix at the top. - */ - _invert_vdm (tmp_m, k); /* much faster than _invert_mat */ - _matmul(tmp_m + k * k, tmp_m, retval->enc_matrix + k * k, n - k, k, k); - /* - * the upper matrix is I so do not bother with a slow multiply - */ - memset (retval->enc_matrix, '\0', k * k * sizeof (gf)); - for (p = retval->enc_matrix, col = 0; col < k; col++, p += k + 1) - *p = 1; - free (tmp_m); - - return retval; -} - -void -fec_encode(const fec_t* code, const gf*restrict const*restrict const src, gf*restrict const*restrict const fecs, const unsigned*restrict const block_nums, size_t num_block_nums, size_t sz) { - unsigned char i, j; - unsigned fecnum; - gf* p; - - for (i=0; i= code->k); - memset(fecs[i], 0, sz); - p = &(code->enc_matrix[fecnum * code->k]); - for (j = 0; j < code->k; j++) - addmul(fecs[i], src[j], p[j], sz); - } -} - -/** - * Build decode matrix into some memory space. - * - * @param matrix a space allocated for a k by k matrix - */ -void -build_decode_matrix_into_space(const fec_t*restrict const code, const unsigned*const restrict index, const unsigned k, gf*restrict const matrix) { - unsigned char i; - gf* p; - for (i=0, p=matrix; i < k; i++, p += k) { - if (index[i] < k) { - memset(p, 0, k); - p[i] = 1; - } else { - memcpy(p, &(code->enc_matrix[index[i] * code->k]), k); - } - } - _invert_mat (matrix, k); -} - -void -fec_decode(const fec_t* code, const gf*restrict const*restrict const inpkts, gf*restrict const*restrict const outpkts, const unsigned*restrict const index, size_t sz) { - gf m_dec[code->k * code->k]; - build_decode_matrix_into_space(code, index, code->k, m_dec); - - unsigned char outix=0; - for (unsigned char row=0; rowk; row++) { - if (index[row] >= code->k) { - memset(outpkts[outix], 0, sz); - for (unsigned char col=0; col < code->k; col++) - addmul(outpkts[outix], inpkts[col], m_dec[row * code->k + col], sz); - outix++; - } - } -} diff --git a/pyfec/fec/fec.h b/pyfec/fec/fec.h deleted file mode 100644 index 2c689c1..0000000 --- a/pyfec/fec/fec.h +++ /dev/null @@ -1,101 +0,0 @@ -/** - * pyfec -- fast forward error correction library with Python interface - * - * Copyright (C) 2007 Allmydata, Inc. - * Author: Zooko Wilcox-O'Hearn - * mailto:zooko@zooko.com - * - * This file is part of pyfec. - * - * This program is free software; you can redistribute it and/or modify it - * under the terms of the GNU General Public License as published by the Free - * Software Foundation; either version 2 of the License, or (at your option) - * any later version. This program also comes with the added permission that, - * in the case that you are obligated to release a derived work under this - * licence (as per section 2.b of the GPL), you may delay the fulfillment of - * this obligation for up to 12 months. - * - * This program is distributed in the hope that it will be useful, - * but WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - * GNU General Public License for more details. - * - * You should have received a copy of the GNU General Public License - * along with this program; if not, write to the Free Software - * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA. - */ - -/* - * Much of this work is derived from the "fec" software by Luigi Rizzo, et - * al., the copyright notice and licence terms of which are included below - * for reference. - * - * fec.h -- forward error correction based on Vandermonde matrices - * 980614 - * (C) 1997-98 Luigi Rizzo (luigi@iet.unipi.it) - * - * Portions derived from code by Phil Karn (karn@ka9q.ampr.org), - * Robert Morelos-Zaragoza (robert@spectra.eng.hawaii.edu) and Hari - * Thirumoorthy (harit@spectra.eng.hawaii.edu), Aug 1995 - * - * Modifications by Dan Rubenstein (see Modifications.txt for - * their description. - * Modifications (C) 1998 Dan Rubenstein (drubenst@cs.umass.edu) - * - * Redistribution and use in source and binary forms, with or without - * modification, are permitted provided that the following conditions - * are met: - - * 1. Redistributions of source code must retain the above copyright - * notice, this list of conditions and the following disclaimer. - * 2. Redistributions in binary form must reproduce the above - * copyright notice, this list of conditions and the following - * disclaimer in the documentation and/or other materials - * provided with the distribution. - * - * THIS SOFTWARE IS PROVIDED BY THE AUTHORS ``AS IS'' AND - * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, - * THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A - * PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHORS - * BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, - * OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, - * PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, - * OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY - * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR - * TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT - * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY - * OF SUCH DAMAGE. - */ - -typedef unsigned char gf; - -typedef struct { - unsigned long magic; - unsigned k, n; /* parameters of the code */ - gf* enc_matrix; -} fec_t; - -/** - * param k the number of blocks required to reconstruct - * param m the total number of blocks created - */ -fec_t* fec_new(unsigned k, unsigned m); -void fec_free(fec_t* p); - -/** - * @param inpkts the "primary blocks" i.e. the chunks of the input data - * @param fecs buffers into which the secondary blocks will be written - * @param block_nums the numbers of the desired blocks -- including both primary blocks (the id < k) which fec_encode() ignores and check blocks (the id >= k) which fec_encode() will produce and store into the buffers of the fecs parameter - * @param num_block_nums the length of the block_nums array - */ -void fec_encode(const fec_t* code, const gf*restrict const*restrict const src, gf*restrict const*restrict const fecs, const unsigned*restrict const block_nums, size_t num_block_nums, size_t sz); - -/** - * @param inpkts an array of packets (size k) - * @param outpkts an array of buffers into which the reconstructed output packets will be written (only packets which are not present in the inpkts input will be reconstructed and written to outpkts) - * @param index an array of the blocknums of the packets in inpkts - * @param sz size of a packet in bytes - */ -void fec_decode(const fec_t* code, const gf*restrict const*restrict const inpkts, gf*restrict const*restrict const outpkts, const unsigned*restrict const index, size_t sz); - -/* end of file */ diff --git a/pyfec/fec/filefec.py b/pyfec/fec/filefec.py deleted file mode 100644 index c8e3258..0000000 --- a/pyfec/fec/filefec.py +++ /dev/null @@ -1,436 +0,0 @@ -# pyfec -- fast forward error correction library with Python interface -# -# Copyright (C) 2007 Allmydata, Inc. -# Author: Zooko Wilcox-O'Hearn -# mailto:zooko@zooko.com -# -# This file is part of pyfec. -# -# This program is free software; you can redistribute it and/or modify it -# under the terms of the GNU General Public License as published by the Free -# Software Foundation; either version 2 of the License, or (at your option) -# any later version. This program also comes with the added permission that, -# in the case that you are obligated to release a derived work under this -# licence (as per section 2.b of the GPL), you may delay the fulfillment of -# this obligation for up to 12 months. -# -# This program is distributed in the hope that it will be useful, -# but WITHOUT ANY WARRANTY; without even the implied warranty of -# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the -# GNU General Public License for more details. -# -# You should have received a copy of the GNU General Public License -# along with this program; if not, write to the Free Software -# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA. - -import easyfec, fec -from util import fileutil -from util.mathutil import log_ceil - -import array, os, re, struct, traceback - -CHUNKSIZE = 4096 - -class InsufficientShareFilesError(fec.Error): - def __init__(self, k, kb, *args, **kwargs): - fec.Error.__init__(self, *args, **kwargs) - self.k = k - self.kb = kb - - def __repr__(self): - return "Insufficient share files -- %d share files are required to recover this file, but only %d were given" % (self.k, self.kb,) - - def __str__(self): - return self.__repr__() - -class CorruptedShareFilesError(fec.Error): - pass - -def _build_header(m, k, pad, sh): - """ - @param m: the total number of shares; 3 <= m <= 256 - @param k: the number of shares required to reconstruct; 2 <= k < m - @param pad: the number of bytes of padding added to the file before encoding; 0 <= pad < k - @param sh: the shnum of this share; 0 <= k < m - - @return: a string (which is hopefully short) encoding m, k, sh, and pad - """ - assert m >= 3 - assert m <= 2**8 - assert k >= 2 - assert k < m - assert pad >= 0 - assert pad < k - - assert sh >= 0 - assert sh < m - - bitsused = 0 - val = 0 - - val |= (m - 3) - bitsused += 8 # the first 8 bits always encode m - - kbits = log_ceil(m-2, 2) # num bits needed to store all possible values of k - val <<= kbits - bitsused += kbits - - val |= (k - 2) - - padbits = log_ceil(k, 2) # num bits needed to store all possible values of pad - val <<= padbits - bitsused += padbits - - val |= pad - - shnumbits = log_ceil(m, 2) # num bits needed to store all possible values of shnum - val <<= shnumbits - bitsused += shnumbits - - val |= sh - - assert bitsused >= 11 - assert bitsused <= 32 - - if bitsused <= 16: - val <<= (16-bitsused) - cs = struct.pack('>H', val) - assert cs[:-2] == '\x00' * (len(cs)-2) - return cs[-2:] - if bitsused <= 24: - val <<= (24-bitsused) - cs = struct.pack('>I', val) - assert cs[:-3] == '\x00' * (len(cs)-3) - return cs[-3:] - else: - val <<= (32-bitsused) - cs = struct.pack('>I', val) - assert cs[:-4] == '\x00' * (len(cs)-4) - return cs[-4:] - -def MASK(bits): - return (1<> b2_bits_left) + 2 - - shbits = log_ceil(m, 2) # num bits needed to store all possible values of shnum - padbits = log_ceil(k, 2) # num bits needed to store all possible values of pad - - val = byte & (~kbitmask) - - needed_padbits = padbits - b2_bits_left - if needed_padbits > 0: - ch = inf.read(1) - if not ch: - raise CorruptedShareFilesError("Share files were corrupted -- share file %r didn't have a complete metadata header at the front. Perhaps the file was truncated." % (inf.name,)) - byte = struct.unpack(">B", ch)[0] - val <<= 8 - val |= byte - needed_padbits -= 8 - assert needed_padbits <= 0 - extrabits = -needed_padbits - pad = val >> extrabits - val &= MASK(extrabits) - - needed_shbits = shbits - extrabits - if needed_shbits > 0: - ch = inf.read(1) - if not ch: - raise CorruptedShareFilesError("Share files were corrupted -- share file %r didn't have a complete metadata header at the front. Perhaps the file was truncated." % (inf.name,)) - byte = struct.unpack(">B", ch)[0] - val <<= 8 - val |= byte - needed_shbits -= 8 - assert needed_shbits <= 0 - - gotshbits = -needed_shbits - - sh = val >> gotshbits - - return (m, k, pad, sh,) - -FORMAT_FORMAT = "%%s.%%0%dd_%%0%dd%%s" -RE_FORMAT = "%s.[0-9]+_[0-9]+%s" -def encode_to_files(inf, fsize, dirname, prefix, k, m, suffix=".fec", overwrite=False, verbose=False): - """ - Encode inf, writing the shares to specially named, newly created files. - - @param fsize: calling read() on inf must yield fsize bytes of data and - then raise an EOFError - @param dirname: the name of the directory into which the sharefiles will - be written - """ - mlen = len(str(m)) - format = FORMAT_FORMAT % (mlen, mlen,) - - padbytes = fec.util.mathutil.pad_size(fsize, k) - - fns = [] - fs = [] - try: - for shnum in range(m): - hdr = _build_header(m, k, padbytes, shnum) - - fn = os.path.join(dirname, format % (prefix, shnum, m, suffix,)) - if verbose: - print "Creating share file %r..." % (fn,) - if overwrite: - f = open(fn, "wb") - else: - fd = os.open(fn, os.O_WRONLY|os.O_CREAT|os.O_EXCL) - f = os.fdopen(fd, "wb") - f.write(hdr) - fs.append(f) - fns.append(fn) - sumlen = [0] - def cb(blocks, length): - assert len(blocks) == len(fs) - oldsumlen = sumlen[0] - sumlen[0] += length - if verbose: - if int((float(oldsumlen) / fsize) * 10) != int((float(sumlen[0]) / fsize) * 10): - print str(int((float(sumlen[0]) / fsize) * 10) * 10) + "% ...", - - if sumlen[0] > fsize: - raise IOError("Wrong file size -- possibly the size of the file changed during encoding. Original size: %d, observed size at least: %s" % (fsize, sumlen[0],)) - for i in range(len(blocks)): - data = blocks[i] - fs[i].write(data) - length -= len(data) - - encode_file_stringy_easyfec(inf, cb, k, m, chunksize=4096) - except EnvironmentError, le: - print "Cannot complete because of exception: " - print le - print "Cleaning up..." - # clean up - while fs: - f = fs.pop() - f.close() ; del f - fn = fns.pop() - if verbose: - print "Cleaning up: trying to remove %r..." % (fn,) - fileutil.remove_if_possible(fn) - return 1 - if verbose: - print - print "Done!" - return 0 - -# Note: if you really prefer base-2 and you change this code, then please -# denote 2^20 as "MiB" instead of "MB" in order to avoid ambiguity. -# Thanks. -# http://en.wikipedia.org/wiki/Megabyte -MILLION_BYTES=10**6 - -def decode_from_files(outf, infiles, verbose=False): - """ - Decode from the first k files in infiles, writing the results to outf. - """ - assert len(infiles) >= 2 - infs = [] - shnums = [] - m = None - k = None - padlen = None - - byteswritten = 0 - for f in infiles: - (nm, nk, npadlen, shnum,) = _parse_header(f) - if not (m is None or m == nm): - raise CorruptedShareFilesError("Share files were corrupted -- share file %r said that m was %s but another share file previously said that m was %s" % (f.name, nm, m,)) - m = nm - if not (k is None or k == nk): - raise CorruptedShareFilesError("Share files were corrupted -- share file %r said that k was %s but another share file previously said that k was %s" % (f.name, nk, k,)) - if k > len(infiles): - raise InsufficientShareFilesError(k, len(infiles)) - k = nk - if not (padlen is None or padlen == npadlen): - raise CorruptedShareFilesError("Share files were corrupted -- share file %r said that pad length was %s but another share file previously said that pad length was %s" % (f.name, npadlen, padlen,)) - padlen = npadlen - - infs.append(f) - shnums.append(shnum) - - if len(infs) == k: - break - - dec = easyfec.Decoder(k, m) - - while True: - chunks = [ inf.read(CHUNKSIZE) for inf in infs ] - if [ch for ch in chunks if len(ch) != len(chunks[-1])]: - raise CorruptedShareFilesError("Share files were corrupted -- all share files are required to be the same length, but they weren't.") - - if len(chunks[-1]) == CHUNKSIZE: - # Then this was a full read, so we're still in the sharefiles. - resultdata = dec.decode(chunks, shnums, padlen=0) - outf.write(resultdata) - byteswritten += len(resultdata) - if verbose: - if ((byteswritten - len(resultdata)) / (10*MILLION_BYTES)) != (byteswritten / (10*MILLION_BYTES)): - print str(byteswritten / MILLION_BYTES) + " MB ...", - else: - # Then this was a short read, so we've reached the end of the sharefiles. - resultdata = dec.decode(chunks, shnums, padlen) - outf.write(resultdata) - return # Done. - if verbose: - print - print "Done!" - -def encode_file(inf, cb, k, m, chunksize=4096): - """ - Read in the contents of inf, encode, and call cb with the results. - - First, k "input blocks" will be read from inf, each input block being of - size chunksize. Then these k blocks will be encoded into m "result - blocks". Then cb will be invoked, passing a list of the m result blocks - as its first argument, and the length of the encoded data as its second - argument. (The length of the encoded data is always equal to k*chunksize, - until the last iteration, when the end of the file has been reached and - less than k*chunksize bytes could be read from the file.) This procedure - is iterated until the end of the file is reached, in which case the space - of the input blocks that is unused is filled with zeroes before encoding. - - Note that the sequence passed in calls to cb() contains mutable array - objects in its first k elements whose contents will be overwritten when - the next segment is read from the input file. Therefore the - implementation of cb() has to either be finished with those first k arrays - before returning, or if it wants to keep the contents of those arrays for - subsequent use after it has returned then it must make a copy of them to - keep. - - @param inf the file object from which to read the data - @param cb the callback to be invoked with the results - @param k the number of shares required to reconstruct the file - @param m the total number of shares created - @param chunksize how much data to read from inf for each of the k input - blocks - """ - enc = fec.Encoder(k, m) - l = tuple([ array.array('c') for i in range(k) ]) - indatasize = k*chunksize # will be reset to shorter upon EOF - eof = False - ZEROES=array.array('c', ['\x00'])*chunksize - while not eof: - # This loop body executes once per segment. - i = 0 - while (i= 3: - return "%s:%s" % (len(x), b32encode(x[-3:]),) - elif len(x) == 2: - return "%s:%s" % (len(x), b32encode(x[-2:]),) - elif len(x) == 1: - return "%s:%s" % (len(x), b32encode(x[-1:]),) - elif len(x) == 0: - return "%s:%s" % (len(x), "--empty--",) - -def _h(k, m, ss): - encer = fec.Encoder(k, m) - nums_and_blocks = list(enumerate(encer.encode(ss))) - assert isinstance(nums_and_blocks, list), nums_and_blocks - assert len(nums_and_blocks) == m, (len(nums_and_blocks), m,) - nums_and_blocks = random.sample(nums_and_blocks, k) - blocks = [ x[1] for x in nums_and_blocks ] - nums = [ x[0] for x in nums_and_blocks ] - decer = fec.Decoder(k, m) - decoded = decer.decode(blocks, nums) - assert len(decoded) == len(ss), (len(decoded), len(ss),) - assert tuple([str(s) for s in decoded]) == tuple([str(s) for s in ss]), (tuple([ab(str(s)) for s in decoded]), tuple([ab(str(s)) for s in ss]),) - -def randstr(n): - return ''.join(map(chr, map(random.randrange, [0]*n, [256]*n))) - -def _help_test_random(): - m = random.randrange(1, 257) - k = random.randrange(1, m+1) - l = random.randrange(0, 2**10) - ss = [ randstr(l/k) for x in range(k) ] - _h(k, m, ss) - -def _help_test_random_with_l(l): - m = 83 - k = 19 - ss = [ randstr(l/k) for x in range(k) ] - _h(k, m, ss) - -class Fec(unittest.TestCase): - def test_random(self): - for i in range(3): - _help_test_random() - if VERBOSE: - print "%d randomized tests pass." % (i+1) - - def test_bad_args_enc(self): - encer = fec.Encoder(2, 4) - try: - encer.encode(["a", "b", ], ["c", "I am not an integer blocknum",]) - except fec.Error, e: - assert "Precondition violation: second argument is required to contain int" in str(e), e - else: - raise "Should have gotten fec.Error for wrong type of second argument." - - try: - encer.encode(["a", "b", ], 98) # not a sequence at all - except TypeError, e: - assert "Second argument (optional) was not a sequence" in str(e), e - else: - raise "Should have gotten TypeError for wrong type of second argument." - - def test_bad_args_dec(self): - decer = fec.Decoder(2, 4) - - try: - decer.decode(98, [0, 1]) # first argument is not a sequence - except TypeError, e: - assert "First argument was not a sequence" in str(e), e - else: - raise "Should have gotten TypeError for wrong type of second argument." - - try: - decer.decode(["a", "b", ], ["c", "d",]) - except fec.Error, e: - assert "Precondition violation: second argument is required to contain int" in str(e), e - else: - raise "Should have gotten fec.Error for wrong type of second argument." - - try: - decer.decode(["a", "b", ], 98) # not a sequence at all - except TypeError, e: - assert "Second argument was not a sequence" in str(e), e - else: - raise "Should have gotten TypeError for wrong type of second argument." - -class FileFec(unittest.TestCase): - def test_filefec_header(self): - for m in [3, 5, 7, 9, 11, 17, 19, 33, 35, 65, 66, 67, 129, 130, 131, 254, 255, 256,]: - for k in [2, 3, 5, 9, 17, 33, 65, 129, 255,]: - if k >= m: - continue - for pad in [0, 1, k-1,]: - if pad >= k: - continue - for sh in [0, 1, m-1,]: - if sh >= m: - continue - h = fec.filefec._build_header(m, k, pad, sh) - hio = cStringIO.StringIO(h) - (rm, rk, rpad, rsh,) = fec.filefec._parse_header(hio) - assert (rm, rk, rpad, rsh,) == (m, k, pad, sh,), h - - def _help_test_filefec(self, teststr, k, m, numshs=None): - if numshs == None: - numshs = m - - TESTFNAME = "testfile.txt" - PREFIX = "test" - SUFFIX = ".fec" - - tempdir = fec.util.fileutil.NamedTemporaryDirectory(cleanup=False) - try: - tempfn = os.path.join(tempdir.name, TESTFNAME) - tempf = open(tempfn, 'wb') - tempf.write(teststr) - tempf.close() - fsize = os.path.getsize(tempfn) - assert fsize == len(teststr) - - # encode the file - fec.filefec.encode_to_files(open(tempfn, 'rb'), fsize, tempdir.name, PREFIX, k, m, SUFFIX, verbose=VERBOSE) - - # select some share files - RE=re.compile(fec.filefec.RE_FORMAT % (PREFIX, SUFFIX,)) - fns = os.listdir(tempdir.name) - sharefs = [ open(os.path.join(tempdir.name, fn), "rb") for fn in fns if RE.match(fn) ] - random.shuffle(sharefs) - del sharefs[numshs:] - - # decode from the share files - outf = open(os.path.join(tempdir.name, 'recovered-testfile.txt'), 'wb') - fec.filefec.decode_from_files(outf, sharefs, verbose=VERBOSE) - outf.close() - - tempfn = open(os.path.join(tempdir.name, 'recovered-testfile.txt'), 'rb') - recovereddata = tempfn.read() - assert recovereddata == teststr - finally: - tempdir.shutdown() - - def test_filefec_all_shares(self): - return self._help_test_filefec("Yellow Whirled!", 3, 8) - - def test_filefec_all_shares_with_padding(self, noisy=VERBOSE): - return self._help_test_filefec("Yellow Whirled!A", 3, 8) - - def test_filefec_min_shares_with_padding(self, noisy=VERBOSE): - return self._help_test_filefec("Yellow Whirled!A", 3, 8, numshs=3) - -if __name__ == "__main__": - if hasattr(unittest, 'main'): - unittest.main() - else: - sys.path.append(os.getcwd()) - mods = [] - fullname = os.path.realpath(os.path.abspath(__file__)) - for pathel in sys.path: - fullnameofpathel = os.path.realpath(os.path.abspath(pathel)) - if fullname.startswith(fullnameofpathel): - relname = fullname[len(fullnameofpathel):] - mod = (os.path.splitext(relname)[0]).replace(os.sep, '.').strip('.') - mods.append(mod) - - mods.sort(cmp=lambda x, y: cmp(len(x), len(y))) - mods.reverse() - for mod in mods: - cmdstr = "trial %s %s" % (' '.join(sys.argv[1:]), mod) - print cmdstr - if os.system(cmdstr) == 0: - break diff --git a/pyfec/fec/util/__init__.py b/pyfec/fec/util/__init__.py deleted file mode 100644 index e69de29..0000000 diff --git a/pyfec/fec/util/argparse.py b/pyfec/fec/util/argparse.py deleted file mode 100644 index 8b273b7..0000000 --- a/pyfec/fec/util/argparse.py +++ /dev/null @@ -1,1799 +0,0 @@ -# -*- coding: utf-8 -*- - -# Copyright © 2006 Steven J. Bethard . -# -# Redistribution and use in source and binary forms, with or without -# modification, are permitted under the terms of the 3-clause BSD -# license. No warranty expressed or implied. - -"""Command-line parsing library - -This module is an optparse-inspired command-line parsing library that: - -* handles both optional and positional arguments -* produces highly informative usage messages -* supports parsers that dispatch to sub-parsers - -The following is a simple usage example that sums integers from the -command-line and writes the result to a file: - - parser = argparse.ArgumentParser( - description='sum the integers at the command line') - parser.add_argument( - 'integers', metavar='int', nargs='+', type=int, - help='an integer to be summed') - parser.add_argument( - '--log', default=sys.stdout, type=argparse.FileType('w'), - help='the file where the sum should be written') - args = parser.parse_args() - args.log.write('%s' % sum(args.integers)) - args.log.close() - -The module contains the following public classes: - - ArgumentParser -- The main entry point for command-line parsing. As the - example above shows, the add_argument() method is used to populate - the parser with actions for optional and positional arguments. Then - the parse_args() method is invoked to convert the args at the - command-line into an object with attributes. - - ArgumentError -- The exception raised by ArgumentParser objects when - there are errors with the parser's actions. Errors raised while - parsing the command-line are caught by ArgumentParser and emitted - as command-line messages. - - FileType -- A factory for defining types of files to be created. As the - example above shows, instances of FileType are typically passed as - the type= argument of add_argument() calls. - - Action -- The base class for parser actions. Typically actions are - selected by passing strings like 'store_true' or 'append_const' to - the action= argument of add_argument(). However, for greater - customization of ArgumentParser actions, subclasses of Action may - be defined and passed as the action= argument. - - HelpFormatter, RawDescriptionHelpFormatter -- Formatter classes which - may be passed as the formatter_class= argument to the - ArgumentParser constructor. HelpFormatter is the default, while - RawDescriptionHelpFormatter tells the parser not to perform any - line-wrapping on description text. - -All other classes in this module are considered implementation details. -(Also note that HelpFormatter and RawDescriptionHelpFormatter are only -considered public as object names -- the API of the formatter objects is -still considered an implementation detail.) -""" - -import os as _os -import re as _re -import sys as _sys -import textwrap as _textwrap - -from gettext import gettext as _ - -SUPPRESS = '==SUPPRESS==' - -OPTIONAL = '?' -ZERO_OR_MORE = '*' -ONE_OR_MORE = '+' -PARSER = '==PARSER==' - -# ============================= -# Utility functions and classes -# ============================= - -class _AttributeHolder(object): - """Abstract base class that provides __repr__. - - The __repr__ method returns a string in the format: - ClassName(attr=name, attr=name, ...) - The attributes are determined either by a class-level attribute, - '_kwarg_names', or by inspecting the instance __dict__. - """ - - def __repr__(self): - type_name = type(self).__name__ - arg_strings = [] - for arg in self._get_args(): - arg_strings.append(repr(arg)) - for name, value in self._get_kwargs(): - arg_strings.append('%s=%r' % (name, value)) - return '%s(%s)' % (type_name, ', '.join(arg_strings)) - - def _get_kwargs(self): - return sorted(self.__dict__.items()) - - def _get_args(self): - return [] - -def _ensure_value(namespace, name, value): - if getattr(namespace, name, None) is None: - setattr(namespace, name, value) - return getattr(namespace, name) - - - -# =============== -# Formatting Help -# =============== - -class HelpFormatter(object): - - def __init__(self, - prog, - indent_increment=2, - max_help_position=24, - width=None): - - # default setting for width - if width is None: - try: - width = int(_os.environ['COLUMNS']) - except (KeyError, ValueError): - width = 80 - width -= 2 - - self._prog = prog - self._indent_increment = indent_increment - self._max_help_position = max_help_position - self._width = width - - self._current_indent = 0 - self._level = 0 - self._action_max_length = 0 - - self._root_section = self._Section(self, None) - self._current_section = self._root_section - - self._whitespace_matcher = _re.compile(r'\s+') - self._long_break_matcher = _re.compile(r'\n\n\n+') - - # =============================== - # Section and indentation methods - # =============================== - - def _indent(self): - self._current_indent += self._indent_increment - self._level += 1 - - def _dedent(self): - self._current_indent -= self._indent_increment - assert self._current_indent >= 0, 'Indent decreased below 0.' - self._level -= 1 - - class _Section(object): - def __init__(self, formatter, parent, heading=None): - self.formatter = formatter - self.parent = parent - self.heading = heading - self.items = [] - - def format_help(self): - # format the indented section - if self.parent is not None: - self.formatter._indent() - join = self.formatter._join_parts - for func, args in self.items: - func(*args) - item_help = join(func(*args) for func, args in self.items) - if self.parent is not None: - self.formatter._dedent() - - # return nothing if the section was empty - if not item_help: - return '' - - # add the heading if the section was non-empty - if self.heading is not SUPPRESS and self.heading is not None: - current_indent = self.formatter._current_indent - heading = '%*s%s:\n' % (current_indent, '', self.heading) - else: - heading = '' - - # join the section-initial newline, the heading and the help - return join(['\n', heading, item_help, '\n']) - - def _add_item(self, func, args): - self._current_section.items.append((func, args)) - - # ======================== - # Message building methods - # ======================== - - def start_section(self, heading): - self._indent() - section = self._Section(self, self._current_section, heading) - self._add_item(section.format_help, []) - self._current_section = section - - def end_section(self): - self._current_section = self._current_section.parent - self._dedent() - - def add_text(self, text): - if text is not SUPPRESS and text is not None: - self._add_item(self._format_text, [text]) - - def add_usage(self, usage, optionals, positionals, prefix=None): - if usage is not SUPPRESS: - args = usage, optionals, positionals, prefix - self._add_item(self._format_usage, args) - - def add_argument(self, action): - if action.help is not SUPPRESS: - - # update the maximum item length - invocation = self._format_action_invocation(action) - action_length = len(invocation) + self._current_indent - self._action_max_length = max(self._action_max_length, - action_length) - - # add the item to the list - self._add_item(self._format_action, [action]) - - def add_arguments(self, actions): - for action in actions: - self.add_argument(action) - - # ======================= - # Help-formatting methods - # ======================= - - def format_help(self): - help = self._root_section.format_help() % dict(prog=self._prog) - if help: - help = self._long_break_matcher.sub('\n\n', help) - help = help.strip('\n') + '\n' - return help - - def _join_parts(self, part_strings): - return ''.join(part - for part in part_strings - if part and part is not SUPPRESS) - - def _format_usage(self, usage, optionals, positionals, prefix): - if prefix is None: - prefix = _('usage: ') - - # if no optionals or positionals are available, usage is just prog - if usage is None and not optionals and not positionals: - usage = '%(prog)s' - - # if optionals and positionals are available, calculate usage - elif usage is None: - usage = '%(prog)s' % dict(prog=self._prog) - - # determine width of "usage: PROG" and width of text - prefix_width = len(prefix) + len(usage) + 1 - prefix_indent = self._current_indent + prefix_width - text_width = self._width - self._current_indent - - # put them on one line if they're short enough - format = self._format_actions_usage - action_usage = format(optionals + positionals) - if prefix_width + len(action_usage) + 1 < text_width: - usage = '%s %s' % (usage, action_usage) - - # if they're long, wrap optionals and positionals individually - else: - optional_usage = format(optionals) - positional_usage = format(positionals) - indent = ' ' * prefix_indent - - # usage is made of PROG, optionals and positionals - parts = [usage, ' '] - - # options always get added right after PROG - if optional_usage: - parts.append(_textwrap.fill( - optional_usage, text_width, - initial_indent=indent, - subsequent_indent=indent).lstrip()) - - # if there were options, put arguments on the next line - # otherwise, start them right after PROG - if positional_usage: - part = _textwrap.fill( - positional_usage, text_width, - initial_indent=indent, - subsequent_indent=indent).lstrip() - if optional_usage: - part = '\n' + indent + part - parts.append(part) - usage = ''.join(parts) - - # prefix with 'usage:' - return '%s%s\n\n' % (prefix, usage) - - def _format_actions_usage(self, actions): - parts = [] - for action in actions: - if action.help is SUPPRESS: - continue - - # produce all arg strings - if not action.option_strings: - parts.append(self._format_args(action, action.dest)) - - # produce the first way to invoke the option in brackets - else: - option_string = action.option_strings[0] - - # if the Optional doesn't take a value, format is: - # -s or --long - if action.nargs == 0: - part = '%s' % option_string - - # if the Optional takes a value, format is: - # -s ARGS or --long ARGS - else: - default = action.dest.upper() - args_string = self._format_args(action, default) - part = '%s %s' % (option_string, args_string) - - # make it look optional if it's not required - if not action.required: - part = '[%s]' % part - parts.append(part) - - return ' '.join(parts) - - def _format_text(self, text): - text_width = self._width - self._current_indent - indent = ' ' * self._current_indent - return self._fill_text(text, text_width, indent) + '\n\n' - - def _format_action(self, action): - # determine the required width and the entry label - help_position = min(self._action_max_length + 2, - self._max_help_position) - help_width = self._width - help_position - action_width = help_position - self._current_indent - 2 - action_header = self._format_action_invocation(action) - - # ho nelp; start on same line and add a final newline - if not action.help: - tup = self._current_indent, '', action_header - action_header = '%*s%s\n' % tup - - # short action name; start on the same line and pad two spaces - elif len(action_header) <= action_width: - tup = self._current_indent, '', action_width, action_header - action_header = '%*s%-*s ' % tup - indent_first = 0 - - # long action name; start on the next line - else: - tup = self._current_indent, '', action_header - action_header = '%*s%s\n' % tup - indent_first = help_position - - # collect the pieces of the action help - parts = [action_header] - - # if there was help for the action, add lines of help text - if action.help: - help_text = self._expand_help(action) - help_lines = self._split_lines(help_text, help_width) - parts.append('%*s%s\n' % (indent_first, '', help_lines[0])) - for line in help_lines[1:]: - parts.append('%*s%s\n' % (help_position, '', line)) - - # or add a newline if the description doesn't end with one - elif not action_header.endswith('\n'): - parts.append('\n') - - # return a single string - return self._join_parts(parts) - - def _format_action_invocation(self, action): - if not action.option_strings: - return self._format_metavar(action, action.dest) - - else: - parts = [] - - # if the Optional doesn't take a value, format is: - # -s, --long - if action.nargs == 0: - parts.extend(action.option_strings) - - # if the Optional takes a value, format is: - # -s ARGS, --long ARGS - else: - default = action.dest.upper() - args_string = self._format_args(action, default) - for option_string in action.option_strings: - parts.append('%s %s' % (option_string, args_string)) - - return ', '.join(parts) - - def _format_metavar(self, action, default_metavar): - if action.metavar is not None: - name = action.metavar - elif action.choices is not None: - choice_strs = (str(choice) for choice in action.choices) - name = '{%s}' % ','.join(choice_strs) - else: - name = default_metavar - return name - - def _format_args(self, action, default_metavar): - name = self._format_metavar(action, default_metavar) - if action.nargs is None: - result = name - elif action.nargs == OPTIONAL: - result = '[%s]' % name - elif action.nargs == ZERO_OR_MORE: - result = '[%s [%s ...]]' % (name, name) - elif action.nargs == ONE_OR_MORE: - result = '%s [%s ...]' % (name, name) - elif action.nargs is PARSER: - result = '%s ...' % name - else: - result = ' '.join([name] * action.nargs) - return result - - def _expand_help(self, action): - params = dict(vars(action), prog=self._prog) - for name, value in params.items(): - if value is SUPPRESS: - del params[name] - if params.get('choices') is not None: - choices_str = ', '.join(str(c) for c in params['choices']) - params['choices'] = choices_str - return action.help % params - - def _split_lines(self, text, width): - text = self._whitespace_matcher.sub(' ', text).strip() - return _textwrap.wrap(text, width) - - def _fill_text(self, text, width, indent): - text = self._whitespace_matcher.sub(' ', text).strip() - return _textwrap.fill(text, width, initial_indent=indent, - subsequent_indent=indent) - -class RawDescriptionHelpFormatter(HelpFormatter): - - def _fill_text(self, text, width, indent): - return ''.join(indent + line for line in text.splitlines(True)) - -class RawTextHelpFormatter(RawDescriptionHelpFormatter): - - def _split_lines(self, text, width): - return text.splitlines() - -# ===================== -# Options and Arguments -# ===================== - -class ArgumentError(Exception): - """ArgumentError(message, argument) - - Raised whenever there was an error creating or using an argument - (optional or positional). - - The string value of this exception is the message, augmented with - information about the argument that caused it. - """ - - def __init__(self, argument, message): - if argument.option_strings: - self.argument_name = '/'.join(argument.option_strings) - elif argument.metavar not in (None, SUPPRESS): - self.argument_name = argument.metavar - elif argument.dest not in (None, SUPPRESS): - self.argument_name = argument.dest - else: - self.argument_name = None - self.message = message - - def __str__(self): - if self.argument_name is None: - format = '%(message)s' - else: - format = 'argument %(argument_name)s: %(message)s' - return format % dict(message=self.message, - argument_name=self.argument_name) - -# ============== -# Action classes -# ============== - -class Action(_AttributeHolder): - """Action(*strings, **options) - - Action objects hold the information necessary to convert a - set of command-line arguments (possibly including an initial option - string) into the desired Python object(s). - - Keyword Arguments: - - option_strings -- A list of command-line option strings which - should be associated with this action. - - dest -- The name of the attribute to hold the created object(s) - - nargs -- The number of command-line arguments that should be consumed. - By default, one argument will be consumed and a single value will - be produced. Other values include: - * N (an integer) consumes N arguments (and produces a list) - * '?' consumes zero or one arguments - * '*' consumes zero or more arguments (and produces a list) - * '+' consumes one or more arguments (and produces a list) - Note that the difference between the default and nargs=1 is that - with the default, a single value will be produced, while with - nargs=1, a list containing a single value will be produced. - - const -- The value to be produced if the option is specified and the - option uses an action that takes no values. - - default -- The value to be produced if the option is not specified. - - type -- The type which the command-line arguments should be converted - to, should be one of 'string', 'int', 'float', 'complex' or a - callable object that accepts a single string argument. If None, - 'string' is assumed. - - choices -- A container of values that should be allowed. If not None, - after a command-line argument has been converted to the appropriate - type, an exception will be raised if it is not a member of this - collection. - - required -- True if the action must always be specified at the command - line. This is only meaningful for optional command-line arguments. - - help -- The help string describing the argument. - - metavar -- The name to be used for the option's argument with the help - string. If None, the 'dest' value will be used as the name. - """ - - - def __init__(self, - option_strings, - dest, - nargs=None, - const=None, - default=None, - type=None, - choices=None, - required=False, - help=None, - metavar=None): - self.option_strings = option_strings - self.dest = dest - self.nargs = nargs - self.const = const - self.default = default - self.type = type - self.choices = choices - self.required = required - self.help = help - self.metavar = metavar - - def _get_kwargs(self): - names = [ - 'option_strings', - 'dest', - 'nargs', - 'const', - 'default', - 'type', - 'choices', - 'help', - 'metavar' - ] - return [(name, getattr(self, name)) for name in names] - - def __call__(self, parser, namespace, values, option_string=None): - raise NotImplementedError(_('.__call__() not defined')) - -class _StoreAction(Action): - def __init__(self, - option_strings, - dest, - nargs=None, - const=None, - default=None, - type=None, - choices=None, - required=False, - help=None, - metavar=None): - if nargs == 0: - raise ValueError('nargs must be > 0') - if const is not None and nargs != OPTIONAL: - raise ValueError('nargs must be %r to supply const' % OPTIONAL) - super(_StoreAction, self).__init__( - option_strings=option_strings, - dest=dest, - nargs=nargs, - const=const, - default=default, - type=type, - choices=choices, - required=required, - help=help, - metavar=metavar) - - def __call__(self, parser, namespace, values, option_string=None): - setattr(namespace, self.dest, values) - -class _StoreConstAction(Action): - def __init__(self, - option_strings, - dest, - const, - default=None, - required=False, - help=None, - metavar=None): - super(_StoreConstAction, self).__init__( - option_strings=option_strings, - dest=dest, - nargs=0, - const=const, - default=default, - required=required, - help=help) - - def __call__(self, parser, namespace, values, option_string=None): - setattr(namespace, self.dest, self.const) - -class _StoreTrueAction(_StoreConstAction): - def __init__(self, - option_strings, - dest, - default=False, - required=False, - help=None): - super(_StoreTrueAction, self).__init__( - option_strings=option_strings, - dest=dest, - const=True, - default=default, - required=required, - help=help) - -class _StoreFalseAction(_StoreConstAction): - def __init__(self, - option_strings, - dest, - default=True, - required=False, - help=None): - super(_StoreFalseAction, self).__init__( - option_strings=option_strings, - dest=dest, - const=False, - default=default, - required=required, - help=help) - -class _AppendAction(Action): - def __init__(self, - option_strings, - dest, - nargs=None, - const=None, - default=None, - type=None, - choices=None, - required=False, - help=None, - metavar=None): - if nargs == 0: - raise ValueError('nargs must be > 0') - if const is not None and nargs != OPTIONAL: - raise ValueError('nargs must be %r to supply const' % OPTIONAL) - super(_AppendAction, self).__init__( - option_strings=option_strings, - dest=dest, - nargs=nargs, - const=const, - default=default, - type=type, - choices=choices, - required=required, - help=help, - metavar=metavar) - - def __call__(self, parser, namespace, values, option_string=None): - _ensure_value(namespace, self.dest, []).append(values) - -class _AppendConstAction(Action): - def __init__(self, - option_strings, - dest, - const, - default=None, - required=False, - help=None, - metavar=None): - super(_AppendConstAction, self).__init__( - option_strings=option_strings, - dest=dest, - nargs=0, - const=const, - default=default, - required=required, - help=help, - metavar=metavar) - - def __call__(self, parser, namespace, values, option_string=None): - _ensure_value(namespace, self.dest, []).append(self.const) - -class _CountAction(Action): - def __init__(self, - option_strings, - dest, - default=None, - required=False, - help=None): - super(_CountAction, self).__init__( - option_strings=option_strings, - dest=dest, - nargs=0, - default=default, - required=required, - help=help) - - def __call__(self, parser, namespace, values, option_string=None): - new_count = _ensure_value(namespace, self.dest, 0) + 1 - setattr(namespace, self.dest, new_count) - -class _HelpAction(Action): - def __init__(self, - option_strings, - dest, - help=None): - super(_HelpAction, self).__init__( - option_strings=option_strings, - dest=SUPPRESS, - nargs=0, - help=help) - - def __call__(self, parser, namespace, values, option_string=None): - parser.print_help() - parser.exit() - -class _VersionAction(Action): - def __init__(self, - option_strings, - dest, - help=None): - super(_VersionAction, self).__init__( - option_strings=option_strings, - dest=SUPPRESS, - nargs=0, - help=help) - - def __call__(self, parser, namespace, values, option_string=None): - parser.print_version() - parser.exit() - -class _SubParsersAction(Action): - - def __init__(self, - option_strings, - prog, - parser_class, - dest=SUPPRESS, - help=None, - metavar=None): - - self._prog_prefix = prog - self._parser_class = parser_class - self._name_parser_map = {} - - super(_SubParsersAction, self).__init__( - option_strings=option_strings, - dest=dest, - nargs=PARSER, - choices=self._name_parser_map, - help=help, - metavar=metavar) - - def add_parser(self, name, **kwargs): - if kwargs.get('prog') is None: - kwargs['prog'] = '%s %s' % (self._prog_prefix, name) - - parser = self._parser_class(**kwargs) - self._name_parser_map[name] = parser - return parser - - def __call__(self, parser, namespace, values, option_string=None): - parser_name = values[0] - arg_strings = values[1:] - - # set the parser name if requested - if self.dest is not SUPPRESS: - setattr(namespace, self.dest, parser_name) - - # select the parser - try: - parser = self._name_parser_map[parser_name] - except KeyError: - tup = parser_name, ', '.join(self._name_parser_map) - msg = _('unknown parser %r (choices: %s)' % tup) - raise ArgumentError(self, msg) - - # parse all the remaining options into the namespace - parser.parse_args(arg_strings, namespace) - - -# ============== -# Type classes -# ============== - -class FileType(object): - """Factory for creating file object types - - Instances of FileType are typically passed as type= arguments to the - ArgumentParser add_argument() method. - - Keyword Arguments: - mode -- A string indicating how the file is to be opened. Accepts the - same values as the builtin open() function. - bufsize -- The file's desired buffer size. Accepts the same values as - the builtin open() function. - exclusiveopen -- A bool indicating whether the attempt to create the file - should fail if there is already a file present by that name. This is - ignored if 'w' is not in mode. - """ - def __init__(self, mode='r', bufsize=None, exclusivecreate=False): - self._mode = mode - self._bufsize = bufsize - if self._bufsize is None: - self._bufsize = -1 - self._exclusivecreate = exclusivecreate - - def __call__(self, string): - # the special argument "-" means sys.std{in,out} - if string == '-': - if self._mode == 'r': - return _sys.stdin - elif self._mode == 'w': - return _sys.stdout - else: - msg = _('argument "-" with mode %r' % self._mode) - raise ValueError(msg) - - # all other arguments are used as file names - if self._exclusivecreate and ('w' in self._mode): - fd = _os.open(string, _os.O_CREAT|_os.O_EXCL) - return _os.fdopen(fd, self._mode, self._bufsize) - else: - return open(string, self._mode, self._bufsize) - - -# =========================== -# Optional and Positional Parsing -# =========================== - -class Namespace(_AttributeHolder): - - def __init__(self, **kwargs): - for name, value in kwargs.iteritems(): - setattr(self, name, value) - - def __eq__(self, other): - return vars(self) == vars(other) - - def __ne__(self, other): - return not (self == other) - - -class _ActionsContainer(object): - def __init__(self, - description, - conflict_handler): - superinit = super(_ActionsContainer, self).__init__ - superinit(description=description) - - self.description = description - self.conflict_handler = conflict_handler - - # set up registries - self._registries = {} - - # register actions - self.register('action', None, _StoreAction) - self.register('action', 'store', _StoreAction) - self.register('action', 'store_const', _StoreConstAction) - self.register('action', 'store_true', _StoreTrueAction) - self.register('action', 'store_false', _StoreFalseAction) - self.register('action', 'append', _AppendAction) - self.register('action', 'append_const', _AppendConstAction) - self.register('action', 'count', _CountAction) - self.register('action', 'help', _HelpAction) - self.register('action', 'version', _VersionAction) - self.register('action', 'parsers', _SubParsersAction) - - # raise an exception if the conflict handler is invalid - self._get_handler() - - # action storage - self._optional_actions_list = [] - self._positional_actions_list = [] - self._positional_actions_full_list = [] - self._option_strings = {} - - # ==================== - # Registration methods - # ==================== - - def register(self, registry_name, value, object): - registry = self._registries.setdefault(registry_name, {}) - registry[value] = object - - def _registry_get(self, registry_name, value, default=None): - return self._registries[registry_name].get(value, default) - - # ======================= - # Adding argument actions - # ======================= - - def add_argument(self, *args, **kwargs): - """ - add_argument(dest, ..., name=value, ...) - add_argument(option_string, option_string, ..., name=value, ...) - """ - - # type='outfile' is deprecated - if kwargs.get('type') == 'outfile': - import warnings - msg = _("use type=FileType('w') instead of type='outfile'") - warnings.warn(msg, DeprecationWarning) - - # if no positional args are supplied or only one is supplied and - # it doesn't look like an option string, parse a positional - # argument - if not args or len(args) == 1 and args[0][0] != '-': - kwargs = self._get_positional_kwargs(*args, **kwargs) - - # otherwise, we're adding an optional argument - else: - kwargs = self._get_optional_kwargs(*args, **kwargs) - - # create the action object, and add it to the parser - action_class = self._pop_action_class(kwargs) - action = action_class(**kwargs) - return self._add_action(action) - - def _add_action(self, action): - # resolve any conflicts - self._check_conflict(action) - - # add to optional or positional list - if action.option_strings: - self._optional_actions_list.append(action) - else: - self._positional_actions_list.append(action) - self._positional_actions_full_list.append(action) - action.container = self - - # index the action by any option strings it has - for option_string in action.option_strings: - self._option_strings[option_string] = action - - # return the created action - return action - - def _add_container_actions(self, container): - for action in container._optional_actions_list: - self._add_action(action) - for action in container._positional_actions_list: - self._add_action(action) - - def _get_positional_kwargs(self, dest, **kwargs): - # make sure required is not specified - if 'required' in kwargs: - msg = _("'required' is an invalid argument for positionals") - raise TypeError(msg) - - # return the keyword arguments with no option strings - return dict(kwargs, dest=dest, option_strings=[]) - - def _get_optional_kwargs(self, *args, **kwargs): - # determine short and long option strings - option_strings = [] - long_option_strings = [] - for option_string in args: - # error on one-or-fewer-character option strings - if len(option_string) < 2: - msg = _('invalid option string %r: ' - 'must be at least two characters long') - raise ValueError(msg % option_string) - - # error on strings that don't start with '-' - if not option_string.startswith('-'): - msg = _('invalid option string %r: ' - 'does not start with "-"') - raise ValueError(msg % option_string) - - # error on strings that are all '-'s - if not option_string.replace('-', ''): - msg = _('invalid option string %r: ' - 'must contain characters other than "-"') - raise ValueError(msg % option_string) - - # strings starting with '--' are long options - option_strings.append(option_string) - if option_string.startswith('--'): - long_option_strings.append(option_string) - - # infer destination, '--foo-bar' -> 'foo_bar' and '-x' -> 'x' - dest = kwargs.pop('dest', None) - if dest is None: - if long_option_strings: - dest_option_string = long_option_strings[0] - else: - dest_option_string = option_strings[0] - dest = dest_option_string.lstrip('-').replace('-', '_') - - # return the updated keyword arguments - return dict(kwargs, dest=dest, option_strings=option_strings) - - def _pop_action_class(self, kwargs, default=None): - action = kwargs.pop('action', default) - return self._registry_get('action', action, action) - - def _get_handler(self): - # determine function from conflict handler string - handler_func_name = '_handle_conflict_%s' % self.conflict_handler - try: - return getattr(self, handler_func_name) - except AttributeError: - msg = _('invalid conflict_resolution value: %r') - raise ValueError(msg % self.conflict_handler) - - def _check_conflict(self, action): - - # find all options that conflict with this option - confl_optionals = [] - for option_string in action.option_strings: - if option_string in self._option_strings: - confl_optional = self._option_strings[option_string] - confl_optionals.append((option_string, confl_optional)) - - # resolve any conflicts - if confl_optionals: - conflict_handler = self._get_handler() - conflict_handler(action, confl_optionals) - - def _handle_conflict_error(self, action, conflicting_actions): - message = _('conflicting option string(s): %s') - conflict_string = ', '.join(option_string - for option_string, action - in conflicting_actions) - raise ArgumentError(action, message % conflict_string) - - def _handle_conflict_resolve(self, action, conflicting_actions): - - # remove all conflicting options - for option_string, action in conflicting_actions: - - # remove the conflicting option - action.option_strings.remove(option_string) - self._option_strings.pop(option_string, None) - - # if the option now has no option string, remove it from the - # container holding it - if not action.option_strings: - action.container._optional_actions_list.remove(action) - - -class _ArgumentGroup(_ActionsContainer): - - def __init__(self, container, title=None, description=None, **kwargs): - # add any missing keyword arguments by checking the container - update = kwargs.setdefault - update('conflict_handler', container.conflict_handler) - superinit = super(_ArgumentGroup, self).__init__ - superinit(description=description, **kwargs) - - self.title = title - self._registries = container._registries - self._positional_actions_full_list = container._positional_actions_full_list - self._option_strings = container._option_strings - - -class ArgumentParser(_AttributeHolder, _ActionsContainer): - - def __init__(self, - prog=None, - usage=None, - description=None, - epilog=None, - version=None, - parents=[], - formatter_class=HelpFormatter, - conflict_handler='error', - add_help=True): - - superinit = super(ArgumentParser, self).__init__ - superinit(description=description, - conflict_handler=conflict_handler) - - # default setting for prog - if prog is None: - prog = _os.path.basename(_sys.argv[0]) - - self.prog = prog - self.usage = usage - self.epilog = epilog - self.version = version - self.formatter_class = formatter_class - self.add_help = add_help - - self._argument_group_class = _ArgumentGroup - self._has_subparsers = False - self._argument_groups = [] - self._defaults = {} - - # register types - def identity(string): - return string - def outfile(string): - if string == '-': - return _sys.stdout - else: - return open(string, 'w') - self.register('type', None, identity) - self.register('type', 'outfile', outfile) - - # add help and version arguments if necessary - if self.add_help: - self._add_help_argument() - if self.version: - self._add_version_argument() - - # add parent arguments and defaults - for parent in parents: - self._add_container_actions(parent) - try: - defaults = parent._defaults - except AttributeError: - pass - else: - self._defaults.update(defaults) - - # determines whether an "option" looks like a negative number - self._negative_number_matcher = _re.compile(r'^-\d+|-\d*.\d+$') - - - # ======================= - # Pretty __repr__ methods - # ======================= - - def _get_kwargs(self): - names = [ - 'prog', - 'usage', - 'description', - 'version', - 'formatter_class', - 'conflict_handler', - 'add_help', - ] - return [(name, getattr(self, name)) for name in names] - - # ================================== - # Namespace default settings methods - # ================================== - - def set_defaults(self, **kwargs): - self._defaults.update(kwargs) - - # ================================== - # Optional/Positional adding methods - # ================================== - - def add_argument_group(self, *args, **kwargs): - group = self._argument_group_class(self, *args, **kwargs) - self._argument_groups.append(group) - return group - - def add_subparsers(self, **kwargs): - if self._has_subparsers: - self.error(_('cannot have multiple subparser arguments')) - - # add the parser class to the arguments if it's not present - kwargs.setdefault('parser_class', type(self)) - - # prog defaults to the usage message of this parser, skipping - # optional arguments and with no "usage:" prefix - if kwargs.get('prog') is None: - formatter = self._get_formatter() - formatter.add_usage(self.usage, [], - self._get_positional_actions(), '') - kwargs['prog'] = formatter.format_help().strip() - - # create the parsers action and add it to the positionals list - parsers_class = self._pop_action_class(kwargs, 'parsers') - action = parsers_class(option_strings=[], **kwargs) - self._positional_actions_list.append(action) - self._positional_actions_full_list.append(action) - self._has_subparsers = True - - # return the created parsers action - return action - - def _add_container_actions(self, container): - super(ArgumentParser, self)._add_container_actions(container) - try: - groups = container._argument_groups - except AttributeError: - pass - else: - for group in groups: - new_group = self.add_argument_group( - title=group.title, - description=group.description, - conflict_handler=group.conflict_handler) - new_group._add_container_actions(group) - - def _get_optional_actions(self): - actions = [] - actions.extend(self._optional_actions_list) - for argument_group in self._argument_groups: - actions.extend(argument_group._optional_actions_list) - return actions - - def _get_positional_actions(self): - return list(self._positional_actions_full_list) - - def _add_help_argument(self): - self.add_argument('-h', '--help', action='help', - help=_('show this help message and exit')) - - def _add_version_argument(self): - self.add_argument('-v', '--version', action='version', - help=_("show program's version number and exit")) - - - # ===================================== - # Command line argument parsing methods - # ===================================== - - def parse_args(self, args=None, namespace=None): - # args default to the system args - if args is None: - args = _sys.argv[1:] - - # default Namespace built from parser defaults - if namespace is None: - namespace = Namespace() - - # add any action defaults that aren't present - optional_actions = self._get_optional_actions() - positional_actions = self._get_positional_actions() - for action in optional_actions + positional_actions: - if action.dest is not SUPPRESS: - if not hasattr(namespace, action.dest): - if action.default is not SUPPRESS: - default = action.default - if isinstance(action.default, basestring): - default = self._get_value(action, default) - setattr(namespace, action.dest, default) - - # add any parser defaults that aren't present - for dest, value in self._defaults.iteritems(): - if not hasattr(namespace, dest): - setattr(namespace, dest, value) - - # parse the arguments and exit if there are any errors - try: - result = self._parse_args(args, namespace) - except ArgumentError, err: - self.error(str(err)) - - # make sure all required optionals are present - for action in self._get_optional_actions(): - if action.required: - if getattr(result, action.dest, None) is None: - opt_strs = '/'.join(action.option_strings) - msg = _('option %s is required' % opt_strs) - self.error(msg) - - # return the parsed arguments - return result - - def _parse_args(self, arg_strings, namespace): - - # find all option indices, and determine the arg_string_pattern - # which has an 'O' if there is an option at an index, - # an 'A' if there is an argument, or a '-' if there is a '--' - option_string_indices = {} - arg_string_pattern_parts = [] - arg_strings_iter = iter(arg_strings) - for i, arg_string in enumerate(arg_strings_iter): - - # all args after -- are non-options - if arg_string == '--': - arg_string_pattern_parts.append('-') - for arg_string in arg_strings_iter: - arg_string_pattern_parts.append('A') - - # otherwise, add the arg to the arg strings - # and note the index if it was an option - else: - option_tuple = self._parse_optional(arg_string) - if option_tuple is None: - pattern = 'A' - else: - option_string_indices[i] = option_tuple - pattern = 'O' - arg_string_pattern_parts.append(pattern) - - # join the pieces together to form the pattern - arg_strings_pattern = ''.join(arg_string_pattern_parts) - - # converts arg strings to the appropriate and then takes the action - def take_action(action, argument_strings, option_string=None): - argument_values = self._get_values(action, argument_strings) - action(self, namespace, argument_values, option_string) - - # function to convert arg_strings into an optional action - def consume_optional(start_index): - - # determine the optional action and parse any explicit - # argument out of the option string - option_tuple = option_string_indices[start_index] - action, option_string, explicit_arg = option_tuple - - # loop because single-dash options can be chained - # (e.g. -xyz is the same as -x -y -z if no args are required) - match_argument = self._match_argument - action_tuples = [] - while True: - - # if we found no optional action, raise an error - if action is None: - self.error(_('no such option: %s') % option_string) - - # if there is an explicit argument, try to match the - # optional's string arguments to only this - if explicit_arg is not None: - arg_count = match_argument(action, 'A') - - # if the action is a single-dash option and takes no - # arguments, try to parse more single-dash options out - # of the tail of the option string - if arg_count == 0 and option_string[1] != '-': - action_tuples.append((action, [], option_string)) - option_string = '-' + explicit_arg - option_tuple = self._parse_optional(option_string) - if option_tuple[0] is None: - msg = _('ignored explicit argument %r') - raise ArgumentError(action, msg % explicit_arg) - - # set the action, etc. for the next loop iteration - action, option_string, explicit_arg = option_tuple - - # if the action expect exactly one argument, we've - # successfully matched the option; exit the loop - elif arg_count == 1: - stop = start_index + 1 - args = [explicit_arg] - action_tuples.append((action, args, option_string)) - break - - # error if a double-dash option did not use the - # explicit argument - else: - msg = _('ignored explicit argument %r') - raise ArgumentError(action, msg % explicit_arg) - - # if there is no explicit argument, try to match the - # optional's string arguments with the following strings - # if successful, exit the loop - else: - start = start_index + 1 - selected_patterns = arg_strings_pattern[start:] - arg_count = match_argument(action, selected_patterns) - stop = start + arg_count - args = arg_strings[start:stop] - action_tuples.append((action, args, option_string)) - break - - # add the Optional to the list and return the index at which - # the Optional's string args stopped - assert action_tuples - for action, args, option_string in action_tuples: - take_action(action, args, option_string) - return stop - - # the list of Positionals left to be parsed; this is modified - # by consume_positionals() - positionals = self._get_positional_actions() - - # function to convert arg_strings into positional actions - def consume_positionals(start_index): - # match as many Positionals as possible - match_partial = self._match_arguments_partial - selected_pattern = arg_strings_pattern[start_index:] - arg_counts = match_partial(positionals, selected_pattern) - - # slice off the appropriate arg strings for each Positional - # and add the Positional and its args to the list - for action, arg_count in zip(positionals, arg_counts): - args = arg_strings[start_index: start_index + arg_count] - start_index += arg_count - take_action(action, args) - - # slice off the Positionals that we just parsed and return the - # index at which the Positionals' string args stopped - positionals[:] = positionals[len(arg_counts):] - return start_index - - # consume Positionals and Optionals alternately, until we have - # passed the last option string - start_index = 0 - if option_string_indices: - max_option_string_index = max(option_string_indices) - else: - max_option_string_index = -1 - while start_index <= max_option_string_index: - - # consume any Positionals preceding the next option - next_option_string_index = min( - index - for index in option_string_indices - if index >= start_index) - if start_index != next_option_string_index: - positionals_end_index = consume_positionals(start_index) - - # only try to parse the next optional if we didn't consume - # the option string during the positionals parsing - if positionals_end_index > start_index: - start_index = positionals_end_index - continue - else: - start_index = positionals_end_index - - # if we consumed all the positionals we could and we're not - # at the index of an option string, there were unparseable - # arguments - if start_index not in option_string_indices: - msg = _('extra arguments found: %s') - extras = arg_strings[start_index:next_option_string_index] - self.error(msg % ' '.join(extras)) - - # consume the next optional and any arguments for it - start_index = consume_optional(start_index) - - # consume any positionals following the last Optional - stop_index = consume_positionals(start_index) - - # if we didn't consume all the argument strings, there were too - # many supplied - if stop_index != len(arg_strings): - extras = arg_strings[stop_index:] - self.error(_('extra arguments found: %s') % ' '.join(extras)) - - # if we didn't use all the Positional objects, there were too few - # arg strings supplied. - if positionals: - self.error(_('too few arguments')) - - # return the updated namespace - return namespace - - def _match_argument(self, action, arg_strings_pattern): - # match the pattern for this action to the arg strings - nargs_pattern = self._get_nargs_pattern(action) - match = _re.match(nargs_pattern, arg_strings_pattern) - - # raise an exception if we weren't able to find a match - if match is None: - nargs_errors = { - None:_('expected one argument'), - OPTIONAL:_('expected at most one argument'), - ONE_OR_MORE:_('expected at least one argument') - } - default = _('expected %s argument(s)') % action.nargs - msg = nargs_errors.get(action.nargs, default) - raise ArgumentError(action, msg) - - # return the number of arguments matched - return len(match.group(1)) - - def _match_arguments_partial(self, actions, arg_strings_pattern): - # progressively shorten the actions list by slicing off the - # final actions until we find a match - result = [] - for i in xrange(len(actions), 0, -1): - actions_slice = actions[:i] - pattern = ''.join(self._get_nargs_pattern(action) - for action in actions_slice) - match = _re.match(pattern, arg_strings_pattern) - if match is not None: - result.extend(len(string) for string in match.groups()) - break - - # return the list of arg string counts - return result - - def _parse_optional(self, arg_string): - # if it doesn't start with a '-', it was meant to be positional - if not arg_string.startswith('-'): - return None - - # if it's just dashes, it was meant to be positional - if not arg_string.strip('-'): - return None - - # if the option string is present in the parser, return the action - if arg_string in self._option_strings: - action = self._option_strings[arg_string] - return action, arg_string, None - - # search through all possible prefixes of the option string - # and all actions in the parser for possible interpretations - option_tuples = [] - prefix_tuples = self._get_option_prefix_tuples(arg_string) - for option_string in self._option_strings: - for option_prefix, explicit_arg in prefix_tuples: - if option_string.startswith(option_prefix): - action = self._option_strings[option_string] - tup = action, option_string, explicit_arg - option_tuples.append(tup) - break - - # if multiple actions match, the option string was ambiguous - if len(option_tuples) > 1: - options = ', '.join(opt_str for _, opt_str, _ in option_tuples) - tup = arg_string, options - self.error(_('ambiguous option: %s could match %s') % tup) - - # if exactly one action matched, this segmentation is good, - # so return the parsed action - elif len(option_tuples) == 1: - option_tuple, = option_tuples - return option_tuple - - # if it was not found as an option, but it looks like a negative - # number, it was meant to be positional - if self._negative_number_matcher.match(arg_string): - return None - - # it was meant to be an optional but there is no such option - # in this parser (though it might be a valid option in a subparser) - return None, arg_string, None - - def _get_option_prefix_tuples(self, option_string): - result = [] - - # option strings starting with '--' are only split at the '=' - if option_string.startswith('--'): - if '=' in option_string: - option_prefix, explicit_arg = option_string.split('=', 1) - else: - option_prefix = option_string - explicit_arg = None - tup = option_prefix, explicit_arg - result.append(tup) - - # option strings starting with '-' are split at all indices - else: - for first_index, char in enumerate(option_string): - if char != '-': - break - for i in xrange(len(option_string), first_index, -1): - tup = option_string[:i], option_string[i:] or None - result.append(tup) - - # return the collected prefix tuples - return result - - def _get_nargs_pattern(self, action): - # in all examples below, we have to allow for '--' args - # which are represented as '-' in the pattern - nargs = action.nargs - - # the default (None) is assumed to be a single argument - if nargs is None: - nargs_pattern = '(-*A-*)' - - # allow zero or one arguments - elif nargs == OPTIONAL: - nargs_pattern = '(-*A?-*)' - - # allow zero or more arguments - elif nargs == ZERO_OR_MORE: - nargs_pattern = '(-*[A-]*)' - - # allow one or more arguments - elif nargs == ONE_OR_MORE: - nargs_pattern = '(-*A[A-]*)' - - # allow one argument followed by any number of options or arguments - elif nargs is PARSER: - nargs_pattern = '(-*A[-AO]*)' - - # all others should be integers - else: - nargs_pattern = '(-*%s-*)' % '-*'.join('A' * nargs) - - # if this is an optional action, -- is not allowed - if action.option_strings: - nargs_pattern = nargs_pattern.replace('-*', '') - nargs_pattern = nargs_pattern.replace('-', '') - - # return the pattern - return nargs_pattern - - # ======================== - # Value conversion methods - # ======================== - - def _get_values(self, action, arg_strings): - # for everything but PARSER args, strip out '--' - if action.nargs is not PARSER: - arg_strings = [s for s in arg_strings if s != '--'] - - # optional argument produces a default when not present - if not arg_strings and action.nargs == OPTIONAL: - if action.option_strings: - value = action.const - else: - value = action.default - if isinstance(value, basestring): - value = self._get_value(action, value) - self._check_value(action, value) - - # when nargs='*' on a positional, if there were no command-line - # args, use the default if it is anything other than None - elif (not arg_strings and action.nargs == ZERO_OR_MORE and - not action.option_strings): - if action.default is not None: - value = action.default - else: - value = arg_strings - self._check_value(action, value) - - # single argument or optional argument produces a single value - elif len(arg_strings) == 1 and action.nargs in [None, OPTIONAL]: - arg_string, = arg_strings - value = self._get_value(action, arg_string) - self._check_value(action, value) - - # PARSER arguments convert all values, but check only the first - elif action.nargs is PARSER: - value = list(self._get_value(action, v) for v in arg_strings) - self._check_value(action, value[0]) - - # all other types of nargs produce a list - else: - value = list(self._get_value(action, v) for v in arg_strings) - for v in value: - self._check_value(action, v) - - # return the converted value - return value - - def _get_value(self, action, arg_string): - type_func = self._registry_get('type', action.type, action.type) - if not callable(type_func): - msg = _('%r is not callable') - raise ArgumentError(action, msg % type_func) - - # convert the value to the appropriate type - try: - result = type_func(arg_string) - - # TypeErrors or ValueErrors indicate errors - except (TypeError, ValueError): - name = getattr(action.type, '__name__', repr(action.type)) - msg = _('invalid %s value: %r') - raise ArgumentError(action, msg % (name, arg_string)) - - # return the converted value - return result - - def _check_value(self, action, value): - # converted value must be one of the choices (if specified) - if action.choices is not None and value not in action.choices: - tup = value, ', '.join(map(repr, action.choices)) - msg = _('invalid choice: %r (choose from %s)') % tup - raise ArgumentError(action, msg) - - - - # ======================= - # Help-formatting methods - # ======================= - - def format_usage(self): - formatter = self._get_formatter() - formatter.add_usage(self.usage, - self._get_optional_actions(), - self._get_positional_actions()) - return formatter.format_help() - - def format_help(self): - formatter = self._get_formatter() - - # usage - formatter.add_usage(self.usage, - self._get_optional_actions(), - self._get_positional_actions()) - - # description - formatter.add_text(self.description) - - # positionals - formatter.start_section(_('positional arguments')) - formatter.add_arguments(self._positional_actions_list) - formatter.end_section() - - # optionals - formatter.start_section(_('optional arguments')) - formatter.add_arguments(self._optional_actions_list) - formatter.end_section() - - # user-defined groups - for argument_group in self._argument_groups: - formatter.start_section(argument_group.title) - formatter.add_text(argument_group.description) - formatter.add_arguments(argument_group._positional_actions_list) - formatter.add_arguments(argument_group._optional_actions_list) - formatter.end_section() - - # epilog - formatter.add_text(self.epilog) - - # determine help from format above - return formatter.format_help() - - def format_version(self): - formatter = self._get_formatter() - formatter.add_text(self.version) - return formatter.format_help() - - def _get_formatter(self): - return self.formatter_class(prog=self.prog) - - # ===================== - # Help-printing methods - # ===================== - - def print_usage(self, file=None): - self._print_message(self.format_usage(), file) - - def print_help(self, file=None): - self._print_message(self.format_help(), file) - - def print_version(self, file=None): - self._print_message(self.format_version(), file) - - def _print_message(self, message, file=None): - if message: - if file is None: - file = _sys.stderr - file.write(message) - - - # =============== - # Exiting methods - # =============== - - def exit(self, status=0, message=None): - if message: - _sys.stderr.write(message) - _sys.exit(status) - - def error(self, message): - """error(message: string) - - Prints a usage message incorporating the message to stderr and - exits. - - If you override this in a subclass, it should not return -- it - should either exit or raise an exception. - """ - self.print_usage(_sys.stderr) - self.exit(2, _('%s: error: %s\n') % (self.prog, message)) diff --git a/pyfec/fec/util/fileutil.py b/pyfec/fec/util/fileutil.py deleted file mode 100644 index ef75cc6..0000000 --- a/pyfec/fec/util/fileutil.py +++ /dev/null @@ -1,186 +0,0 @@ -# Copyright (c) 2000 Autonomous Zone Industries -# Copyright (c) 2002-2007 Bryce "Zooko" Wilcox-O'Hearn -# This file is licensed under the -# GNU Lesser General Public License v2.1. -# See the file COPYING or visit http://www.gnu.org/ for details. -# Portions snarfed out of the Python standard library. -# The du part is due to Jim McCoy. - -""" -Futz with files like a pro. -""" - -import exceptions, os, stat, tempfile, time - -try: - from twisted.python import log -except ImportError: - class DummyLog: - def msg(self, *args, **kwargs): - pass - log = DummyLog() - -def rename(src, dst, tries=4, basedelay=0.1): - """ Here is a superkludge to workaround the fact that occasionally on - Windows some other process (e.g. an anti-virus scanner, a local search - engine, etc.) is looking at your file when you want to delete or move it, - and hence you can't. The horrible workaround is to sit and spin, trying - to delete it, for a short time and then give up. - - With the default values of tries and basedelay this can block for less - than a second. - - @param tries: number of tries -- each time after the first we wait twice - as long as the previous wait - @param basedelay: how long to wait before the second try - """ - for i in range(tries-1): - try: - return os.rename(src, dst) - except EnvironmentError, le: - # XXX Tighten this to check if this is a permission denied error (possibly due to another Windows process having the file open and execute the superkludge only in this case. - log.msg("XXX KLUDGE Attempting to move file %s => %s; got %s; sleeping %s seconds" % (src, dst, le, basedelay,)) - time.sleep(basedelay) - basedelay *= 2 - return os.rename(src, dst) # The last try. - -def remove(f, tries=4, basedelay=0.1): - """ Here is a superkludge to workaround the fact that occasionally on - Windows some other process (e.g. an anti-virus scanner, a local search - engine, etc.) is looking at your file when you want to delete or move it, - and hence you can't. The horrible workaround is to sit and spin, trying - to delete it, for a short time and then give up. - - With the default values of tries and basedelay this can block for less - than a second. - - @param tries: number of tries -- each time after the first we wait twice - as long as the previous wait - @param basedelay: how long to wait before the second try - """ - try: - os.chmod(f, stat.S_IWRITE | stat.S_IEXEC | stat.S_IREAD) - except: - pass - for i in range(tries-1): - try: - return os.remove(f) - except EnvironmentError, le: - # XXX Tighten this to check if this is a permission denied error (possibly due to another Windows process having the file open and execute the superkludge only in this case. - if not os.path.exists(f): - return - log.msg("XXX KLUDGE Attempting to remove file %s; got %s; sleeping %s seconds" % (f, le, basedelay,)) - time.sleep(basedelay) - basedelay *= 2 - return os.remove(f) # The last try. - -class NamedTemporaryDirectory: - """ - This calls tempfile.mkdtemp(), stores the name of the dir in - self.name, and rmrf's the dir when it gets garbage collected or - "shutdown()". - """ - def __init__(self, cleanup=True, *args, **kwargs): - """ If cleanup, then the directory will be rmrf'ed when the object is shutdown. """ - self.cleanup = cleanup - self.name = tempfile.mkdtemp(*args, **kwargs) - - def __repr__(self): - return "<%s instance at %x %s>" % (self.__class__.__name__, id(self), self.name) - - def __str__(self): - return self.__repr__() - - def __del__(self): - try: - self.shutdown() - except: - import traceback - traceback.print_exc() - - def shutdown(self): - if self.cleanup and hasattr(self, 'name'): - rm_dir(self.name) - -def make_dirs(dirname, mode=0777, strictmode=False): - """ - A threadsafe and idempotent version of os.makedirs(). If the dir already - exists, do nothing and return without raising an exception. If this call - creates the dir, return without raising an exception. If there is an - error that prevents creation or if the directory gets deleted after - make_dirs() creates it and before make_dirs() checks that it exists, raise - an exception. - - @param strictmode if true, then make_dirs() will raise an exception if the - directory doesn't have the desired mode. For example, if the - directory already exists, and has a different mode than the one - specified by the mode parameter, then if strictmode is true, - make_dirs() will raise an exception, else it will ignore the - discrepancy. - """ - tx = None - try: - os.makedirs(dirname, mode) - except OSError, x: - tx = x - - if not os.path.isdir(dirname): - if tx: - raise tx - raise exceptions.IOError, "unknown error prevented creation of directory, or deleted the directory immediately after creation: %s" % dirname # careful not to construct an IOError with a 2-tuple, as that has a special meaning... - - tx = None - if hasattr(os, 'chmod'): - try: - os.chmod(dirname, mode) - except OSError, x: - tx = x - - if strictmode and hasattr(os, 'stat'): - s = os.stat(dirname) - resmode = stat.S_IMODE(s.st_mode) - if resmode != mode: - if tx: - raise tx - raise exceptions.IOError, "unknown error prevented setting correct mode of directory, or changed mode of the directory immediately after creation. dirname: %s, mode: %04o, resmode: %04o" % (dirname, mode, resmode,) # careful not to construct an IOError with a 2-tuple, as that has a special meaning... - -def rm_dir(dirname): - """ - A threadsafe and idempotent version of shutil.rmtree(). If the dir is - already gone, do nothing and return without raising an exception. If this - call removes the dir, return without raising an exception. If there is an - error that prevents deletion or if the directory gets created again after - rm_dir() deletes it and before rm_dir() checks that it is gone, raise an - exception. - """ - excs = [] - try: - os.chmod(dirname, stat.S_IWRITE | stat.S_IEXEC | stat.S_IREAD) - for f in os.listdir(dirname): - fullname = os.path.join(dirname, f) - if os.path.isdir(fullname): - rm_dir(fullname) - else: - remove(fullname) - os.rmdir(dirname) - except Exception, le: - # Ignore "No such file or directory" - if (not isinstance(le, OSError)) or le.args[0] != 2: - excs.append(le) - - # Okay, now we've recursively removed everything, ignoring any "No - # such file or directory" errors, and collecting any other errors. - - if os.path.exists(dirname): - if len(excs) == 1: - raise excs[0] - if len(excs) == 0: - raise OSError, "Failed to remove dir for unknown reason." - raise OSError, excs - - -def remove_if_possible(f): - try: - remove(f) - except: - pass diff --git a/pyfec/fec/util/mathutil.py b/pyfec/fec/util/mathutil.py deleted file mode 100644 index de95a6a..0000000 --- a/pyfec/fec/util/mathutil.py +++ /dev/null @@ -1,92 +0,0 @@ -# Copyright (c) 2005-2007 Bryce "Zooko" Wilcox-O'Hearn -# mailto:zooko@zooko.com -# http://zooko.com/repos/pyutil -# Permission is hereby granted, free of charge, to any person obtaining a copy -# of this work to deal in this work without restriction (including the rights -# to use, modify, distribute, sublicense, and/or sell copies). - -""" -A few commonly needed functions. -""" - -import math - -def div_ceil(n, d): - """ - The smallest integer k such that k*d >= n. - """ - return (n/d) + (n%d != 0) - -def next_multiple(n, k): - """ - The smallest multiple of k which is >= n. - """ - return div_ceil(n, k) * k - -def pad_size(n, k): - """ - The smallest number that has to be added to n so that n is a multiple of k. - """ - if n%k: - return k - n%k - else: - return 0 - -def is_power_of_k(n, k): - return k**int(math.log(n, k) + 0.5) == n - -def next_power_of_k(n, k): - p = 1 - while p < n: - p *= k - return p - -def ave(l): - return sum(l) / len(l) - -def log_ceil(n, b): - """ - The smallest integer k such that b^k >= n. - - log_ceil(n, 2) is the number of bits needed to store any of n values, e.g. - the number of bits needed to store any of 128 possible values is 7. - """ - p = 1 - k = 0 - while p < n: - p *= b - k += 1 - return k - -def linear_fit_slope(ps): - """ - @param ps a sequence of tuples of (x, y) - """ - avex = ave([x for (x, y) in ps]) - avey = ave([y for (x, y) in ps]) - sxy = sum([ (x - avex) * (y - avey) for (x, y) in ps ]) - sxx = sum([ (x - avex) ** 2 for (x, y) in ps ]) - if sxx == 0: - return None - return sxy / sxx - -def permute(l): - """ - Return all possible permutations of l. - - @type l: sequence - @rtype a set of sequences - """ - if len(l) == 1: - return [l,] - - res = [] - for i in range(len(l)): - l2 = list(l[:]) - x = l2.pop(i) - for l3 in permute(l2): - l3.append(x) - res.append(l3) - - return res - diff --git a/pyfec/fec/util/version.py b/pyfec/fec/util/version.py deleted file mode 100644 index d5cbb36..0000000 --- a/pyfec/fec/util/version.py +++ /dev/null @@ -1,129 +0,0 @@ -# Copyright (c) 2004-2007 Bryce "Zooko" Wilcox-O'Hearn -# mailto:zooko@zooko.com -# http://zooko.com/repos/pyutil -# Permission is hereby granted, free of charge, to any person obtaining a copy -# of this work to deal in this work without restriction (including the rights -# to use, modify, distribute, sublicense, and/or sell copies). - -""" -extended version number class -""" - -from distutils import version - -# End users see version strings like this: - -# "1.0.0" -# ^ ^ ^ -# | | | -# | | '- micro version number -# | '- minor version number -# '- major version number - -# The first number is "major version number". The second number is the "minor -# version number" -- it gets bumped whenever we make a new release that adds or -# changes functionality. The third version is the "micro version number" -- it -# gets bumped whenever we make a new release that doesn't add or change -# functionality, but just fixes bugs (including performance issues). - -# Early-adopter end users see version strings like this: - -# "1.0.0a1" -# ^ ^ ^^^ -# | | ||| -# | | ||'- release number -# | | |'- alpha or beta (or none) -# | | '- micro version number -# | '- minor version number -# '- major version number - -# The optional "a" or "b" stands for "alpha release" or "beta release" -# respectively. The number after "a" or "b" gets bumped every time we -# make a new alpha or beta release. This has the same form and the same -# meaning as version numbers of releases of Python. - -# Developers see "full version strings", like this: - -# "1.0.0a1-55-UNSTABLE" -# ^ ^ ^^^ ^ ^ -# | | ||| | | -# | | ||| | '- tags -# | | ||| '- nano version number -# | | ||'- release number -# | | |'- alpha or beta (or none) -# | | '- micro version number -# | '- minor version number -# '- major version number - -# The next number is the "nano version number". It is meaningful only to -# developers. It gets bumped whenever a developer changes anything that another -# developer might care about. - -# The last part is the "tags" separated by "_". Standard tags are -# "STABLE" and "UNSTABLE". - -class Tag(str): - def __cmp__(t1, t2): - if t1 == t2: - return 0 - if t1 == "UNSTABLE" and t2 == "STABLE": - return 1 - if t1 == "STABLE" and t2 == "UNSTABLE": - return -1 - return -2 # who knows - -class Version: - def __init__(self, vstring=None): - if vstring: - self.parse(vstring) - - def parse(self, vstring): - i = vstring.find('-') - if i: - svstring = vstring[:i] - estring = vstring[i+1:] - else: - svstring = vstring - estring = None - - self.strictversion = version.StrictVersion(svstring) - - if estring: - try: - (self.nanovernum, tags,) = estring.split('-') - except: - print estring - raise - self.tags = map(Tag, tags.split('_')) - self.tags.sort() - - self.fullstr = '-'.join([str(self.strictversion), str(self.nanovernum), '_'.join(self.tags)]) - - def tags(self): - return self.tags - - def user_str(self): - return self.strictversion.__str__() - - def full_str(self): - return self.fullstr - - def __str__(self): - return self.full_str() - - def __repr__(self): - return self.__str__() - - def __cmp__ (self, other): - if isinstance(other, basestring): - other = Version(other) - - res = cmp(self.strictversion, other.strictversion) - if res != 0: - return res - - res = cmp(self.nanovernum, other.nanovernum) - if res != 0: - return res - - return cmp(self.tags, other.tags) diff --git a/pyfec/setup.py b/pyfec/setup.py deleted file mode 100755 index bd1321f..0000000 --- a/pyfec/setup.py +++ /dev/null @@ -1,69 +0,0 @@ -#!/usr/bin/env python - -# pyfec -- fast forward error correction library with Python interface -# -# Copyright (C) 2007 Allmydata, Inc. -# Author: Zooko Wilcox-O'Hearn -# mailto:zooko@zooko.com -# -# This file is part of pyfec. -# -# This program is free software; you can redistribute it and/or -# modify it under the terms of the GNU General Public License -# as published by the Free Software Foundation; either version 2 -# of the License, or (at your option) any later version. -# -# This program is distributed in the hope that it will be useful, -# but WITHOUT ANY WARRANTY; without even the implied warranty of -# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the -# GNU General Public License for more details. -# -# You should have received a copy of the GNU General Public License -# along with this program; if not, write to the Free Software -# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA. - -from distutils.core import Extension, setup - - -DEBUGMODE=False -# DEBUGMODE=True - -extra_compile_args=[] -extra_link_args=[] - -extra_compile_args.append("-std=c99") - -undef_macros=[] - -if DEBUGMODE: - extra_compile_args.append("-O0") - extra_compile_args.append("-g") - extra_compile_args.append("-Wall") - extra_link_args.append("-g") - undef_macros.append('NDEBUG') - -trove_classifiers=[ - "Development Status :: 4 - Beta", - "Environment :: No Input/Output (Daemon)", - "Intended Audience :: Developers", - "License :: OSI Approved :: GNU General Public License (GPL)", - "Natural Language :: English", - "Operating System :: OS Independent", - "Programming Language :: C", - "Programming Language :: Python", - "Topic :: System :: Archiving :: Backup", - ] - -setup(name='pyfec', - version='1.0.0a1', - summary='Provides a fast C implementation of Reed-Solomon erasure coding with a Python interface.', - description='Erasure coding is the generation of redundant blocks of information such that if some blocks are lost ("erased") then the original data can be recovered from the remaining blocks. This package contains an optimized implementation along with a Python interface.', - author='Zooko O\'Whielacronx', - author_email='zooko@zooko.com', - url='http://www.allmydata.com/source/pyfec', - license='GNU GPL', - platform='Any', - packages=['fec', 'fec.util', 'fec.test'], - classifiers=trove_classifiers, - ext_modules=[Extension('_fec', ['fec/fec.c', 'fec/_fecmodule.c',], extra_link_args=extra_link_args, extra_compile_args=extra_compile_args, undef_macros=undef_macros),], - ) diff --git a/zfec/COPYING b/zfec/COPYING new file mode 100644 index 0000000..d7e9f2a --- /dev/null +++ b/zfec/COPYING @@ -0,0 +1,346 @@ +In addition to the terms of the GNU General Public License, the zfec package +also comes with a special added permission that you may delay for up to 12 +months the fulfillment of your obligation under GPL section 2.b to release any +derived work under this licence. + + + + GNU GENERAL PUBLIC LICENSE + Version 2, June 1991 + + Copyright (C) 1989, 1991 Free Software Foundation, Inc., + 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + Everyone is permitted to copy and distribute verbatim copies + of this license document, but changing it is not allowed. + + Preamble + + The licenses for most software are designed to take away your +freedom to share and change it. By contrast, the GNU General Public +License is intended to guarantee your freedom to share and change free +software--to make sure the software is free for all its users. This +General Public License applies to most of the Free Software +Foundation's software and to any other program whose authors commit to +using it. (Some other Free Software Foundation software is covered by +the GNU Lesser General Public License instead.) You can apply it to +your programs, too. + + When we speak of free software, we are referring to freedom, not +price. Our General Public Licenses are designed to make sure that you +have the freedom to distribute copies of free software (and charge for +this service if you wish), that you receive source code or can get it +if you want it, that you can change the software or use pieces of it +in new free programs; and that you know you can do these things. + + To protect your rights, we need to make restrictions that forbid +anyone to deny you these rights or to ask you to surrender the rights. +These restrictions translate to certain responsibilities for you if you +distribute copies of the software, or if you modify it. + + For example, if you distribute copies of such a program, whether +gratis or for a fee, you must give the recipients all the rights that +you have. You must make sure that they, too, receive or can get the +source code. And you must show them these terms so they know their +rights. + + We protect your rights with two steps: (1) copyright the software, and +(2) offer you this license which gives you legal permission to copy, +distribute and/or modify the software. + + Also, for each author's protection and ours, we want to make certain +that everyone understands that there is no warranty for this free +software. If the software is modified by someone else and passed on, we +want its recipients to know that what they have is not the original, so +that any problems introduced by others will not reflect on the original +authors' reputations. + + Finally, any free program is threatened constantly by software +patents. We wish to avoid the danger that redistributors of a free +program will individually obtain patent licenses, in effect making the +program proprietary. To prevent this, we have made it clear that any +patent must be licensed for everyone's free use or not licensed at all. + + The precise terms and conditions for copying, distribution and +modification follow. + + GNU GENERAL PUBLIC LICENSE + TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION + + 0. This License applies to any program or other work which contains +a notice placed by the copyright holder saying it may be distributed +under the terms of this General Public License. The "Program", below, +refers to any such program or work, and a "work based on the Program" +means either the Program or any derivative work under copyright law: +that is to say, a work containing the Program or a portion of it, +either verbatim or with modifications and/or translated into another +language. (Hereinafter, translation is included without limitation in +the term "modification".) Each licensee is addressed as "you". + +Activities other than copying, distribution and modification are not +covered by this License; they are outside its scope. The act of +running the Program is not restricted, and the output from the Program +is covered only if its contents constitute a work based on the +Program (independent of having been made by running the Program). +Whether that is true depends on what the Program does. + + 1. You may copy and distribute verbatim copies of the Program's +source code as you receive it, in any medium, provided that you +conspicuously and appropriately publish on each copy an appropriate +copyright notice and disclaimer of warranty; keep intact all the +notices that refer to this License and to the absence of any warranty; +and give any other recipients of the Program a copy of this License +along with the Program. + +You may charge a fee for the physical act of transferring a copy, and +you may at your option offer warranty protection in exchange for a fee. + + 2. You may modify your copy or copies of the Program or any portion +of it, thus forming a work based on the Program, and copy and +distribute such modifications or work under the terms of Section 1 +above, provided that you also meet all of these conditions: + + a) You must cause the modified files to carry prominent notices + stating that you changed the files and the date of any change. + + b) You must cause any work that you distribute or publish, that in + whole or in part contains or is derived from the Program or any + part thereof, to be licensed as a whole at no charge to all third + parties under the terms of this License. + + c) If the modified program normally reads commands interactively + when run, you must cause it, when started running for such + interactive use in the most ordinary way, to print or display an + announcement including an appropriate copyright notice and a + notice that there is no warranty (or else, saying that you provide + a warranty) and that users may redistribute the program under + these conditions, and telling the user how to view a copy of this + License. (Exception: if the Program itself is interactive but + does not normally print such an announcement, your work based on + the Program is not required to print an announcement.) + +These requirements apply to the modified work as a whole. If +identifiable sections of that work are not derived from the Program, +and can be reasonably considered independent and separate works in +themselves, then this License, and its terms, do not apply to those +sections when you distribute them as separate works. But when you +distribute the same sections as part of a whole which is a work based +on the Program, the distribution of the whole must be on the terms of +this License, whose permissions for other licensees extend to the +entire whole, and thus to each and every part regardless of who wrote it. + +Thus, it is not the intent of this section to claim rights or contest +your rights to work written entirely by you; rather, the intent is to +exercise the right to control the distribution of derivative or +collective works based on the Program. + +In addition, mere aggregation of another work not based on the Program +with the Program (or with a work based on the Program) on a volume of +a storage or distribution medium does not bring the other work under +the scope of this License. + + 3. You may copy and distribute the Program (or a work based on it, +under Section 2) in object code or executable form under the terms of +Sections 1 and 2 above provided that you also do one of the following: + + a) Accompany it with the complete corresponding machine-readable + source code, which must be distributed under the terms of Sections + 1 and 2 above on a medium customarily used for software interchange; or, + + b) Accompany it with a written offer, valid for at least three + years, to give any third party, for a charge no more than your + cost of physically performing source distribution, a complete + machine-readable copy of the corresponding source code, to be + distributed under the terms of Sections 1 and 2 above on a medium + customarily used for software interchange; or, + + c) Accompany it with the information you received as to the offer + to distribute corresponding source code. (This alternative is + allowed only for noncommercial distribution and only if you + received the program in object code or executable form with such + an offer, in accord with Subsection b above.) + +The source code for a work means the preferred form of the work for +making modifications to it. For an executable work, complete source +code means all the source code for all modules it contains, plus any +associated interface definition files, plus the scripts used to +control compilation and installation of the executable. However, as a +special exception, the source code distributed need not include +anything that is normally distributed (in either source or binary +form) with the major components (compiler, kernel, and so on) of the +operating system on which the executable runs, unless that component +itself accompanies the executable. + +If distribution of executable or object code is made by offering +access to copy from a designated place, then offering equivalent +access to copy the source code from the same place counts as +distribution of the source code, even though third parties are not +compelled to copy the source along with the object code. + + 4. You may not copy, modify, sublicense, or distribute the Program +except as expressly provided under this License. Any attempt +otherwise to copy, modify, sublicense or distribute the Program is +void, and will automatically terminate your rights under this License. +However, parties who have received copies, or rights, from you under +this License will not have their licenses terminated so long as such +parties remain in full compliance. + + 5. You are not required to accept this License, since you have not +signed it. However, nothing else grants you permission to modify or +distribute the Program or its derivative works. These actions are +prohibited by law if you do not accept this License. Therefore, by +modifying or distributing the Program (or any work based on the +Program), you indicate your acceptance of this License to do so, and +all its terms and conditions for copying, distributing or modifying +the Program or works based on it. + + 6. Each time you redistribute the Program (or any work based on the +Program), the recipient automatically receives a license from the +original licensor to copy, distribute or modify the Program subject to +these terms and conditions. You may not impose any further +restrictions on the recipients' exercise of the rights granted herein. +You are not responsible for enforcing compliance by third parties to +this License. + + 7. If, as a consequence of a court judgment or allegation of patent +infringement or for any other reason (not limited to patent issues), +conditions are imposed on you (whether by court order, agreement or +otherwise) that contradict the conditions of this License, they do not +excuse you from the conditions of this License. If you cannot +distribute so as to satisfy simultaneously your obligations under this +License and any other pertinent obligations, then as a consequence you +may not distribute the Program at all. For example, if a patent +license would not permit royalty-free redistribution of the Program by +all those who receive copies directly or indirectly through you, then +the only way you could satisfy both it and this License would be to +refrain entirely from distribution of the Program. + +If any portion of this section is held invalid or unenforceable under +any particular circumstance, the balance of the section is intended to +apply and the section as a whole is intended to apply in other +circumstances. + +It is not the purpose of this section to induce you to infringe any +patents or other property right claims or to contest validity of any +such claims; this section has the sole purpose of protecting the +integrity of the free software distribution system, which is +implemented by public license practices. Many people have made +generous contributions to the wide range of software distributed +through that system in reliance on consistent application of that +system; it is up to the author/donor to decide if he or she is willing +to distribute software through any other system and a licensee cannot +impose that choice. + +This section is intended to make thoroughly clear what is believed to +be a consequence of the rest of this License. + + 8. If the distribution and/or use of the Program is restricted in +certain countries either by patents or by copyrighted interfaces, the +original copyright holder who places the Program under this License +may add an explicit geographical distribution limitation excluding +those countries, so that distribution is permitted only in or among +countries not thus excluded. In such case, this License incorporates +the limitation as if written in the body of this License. + + 9. The Free Software Foundation may publish revised and/or new versions +of the General Public License from time to time. Such new versions will +be similar in spirit to the present version, but may differ in detail to +address new problems or concerns. + +Each version is given a distinguishing version number. If the Program +specifies a version number of this License which applies to it and "any +later version", you have the option of following the terms and conditions +either of that version or of any later version published by the Free +Software Foundation. If the Program does not specify a version number of +this License, you may choose any version ever published by the Free Software +Foundation. + + 10. If you wish to incorporate parts of the Program into other free +programs whose distribution conditions are different, write to the author +to ask for permission. For software which is copyrighted by the Free +Software Foundation, write to the Free Software Foundation; we sometimes +make exceptions for this. Our decision will be guided by the two goals +of preserving the free status of all derivatives of our free software and +of promoting the sharing and reuse of software generally. + + NO WARRANTY + + 11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY +FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN +OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES +PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED +OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF +MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS +TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE +PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, +REPAIR OR CORRECTION. + + 12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING +WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR +REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, +INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING +OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED +TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY +YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER +PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE +POSSIBILITY OF SUCH DAMAGES. + + END OF TERMS AND CONDITIONS + + How to Apply These Terms to Your New Programs + + If you develop a new program, and you want it to be of the greatest +possible use to the public, the best way to achieve this is to make it +free software which everyone can redistribute and change under these terms. + + To do so, attach the following notices to the program. It is safest +to attach them to the start of each source file to most effectively +convey the exclusion of warranty; and each file should have at least +the "copyright" line and a pointer to where the full notice is found. + + + Copyright (C) + + This program is free software; you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 2 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License along + with this program; if not, write to the Free Software Foundation, Inc., + 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA. + +Also add information on how to contact you by electronic and paper mail. + +If the program is interactive, make it output a short notice like this +when it starts in an interactive mode: + + Gnomovision version 69, Copyright (C) year name of author + Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'. + This is free software, and you are welcome to redistribute it + under certain conditions; type `show c' for details. + +The hypothetical commands `show w' and `show c' should show the appropriate +parts of the General Public License. Of course, the commands you use may +be called something other than `show w' and `show c'; they could even be +mouse-clicks or menu items--whatever suits your program. + +You should also get your employer (if you work as a programmer) or your +school, if any, to sign a "copyright disclaimer" for the program, if +necessary. Here is a sample; alter the names: + + Yoyodyne, Inc., hereby disclaims all copyright interest in the program + `Gnomovision' (which makes passes at compilers) written by James Hacker. + + , 1 April 1989 + Ty Coon, President of Vice + +This General Public License does not permit incorporating your program into +proprietary programs. If your program is a subroutine library, you may +consider it more useful to permit linking proprietary applications with the +library. If this is what you want to do, use the GNU Lesser General +Public License instead of this License. diff --git a/zfec/README.txt b/zfec/README.txt new file mode 100644 index 0000000..70d1b5b --- /dev/null +++ b/zfec/README.txt @@ -0,0 +1,215 @@ + * Intro and Licence + +This package implements an "erasure code", or "forward error correction code". +It is offered under the GNU General Public License v2 or (at your option) any +later version. This package also comes with the added permission that, in the +case that you are obligated to release a derived work under this licence (as +per section 2.b of the GPL), you may delay the fulfillment of this obligation +for up to 12 months. + +The most widely known example of an erasure code is the RAID-5 algorithm which +makes it so that in the event of the loss of any one hard drive, the stored +data can be completely recovered. The algorithm in the zfec package has a +similar effect, but instead of recovering from the loss of only a single +element, it can be parameterized to choose in advance the number of elements +whose loss it can tolerate. + +This package is largely based on the old "fec" library by Luigi Rizzo et al., +which is a mature and optimized implementation of erasure coding. The zfec +package makes several changes from the original "fec" package, including +addition of the Python API, refactoring of the C API to support zero-copy +operation, a few clean-ups and micro-optimizations of the core code itself, +and the addition of a command-line tool named "fec". + + + * Community + +The source is currently available via darcs on the web with the command: + +darcs get http://www.allmydata.com/source/zfec + +More information on darcs is available at http://darcs.net + +Please join the zfec mailing list and submit patches: + + + + + * Overview + +This package performs two operations, encoding and decoding. Encoding takes +some input data and expands its size by producing extra "check blocks", also +called "secondary blocks". Decoding takes some data -- any combination of +blocks of the original data (called "primary blocks") and "secondary blocks", +and produces the original data. + +The encoding is parameterized by two integers, k and m. m is the total number +of blocks produced, and k is how many of those blocks are necessary to +reconstruct the original data. m is required to be at least 1 and at most 256, +and k is required to be at least 1 and at most m. + +(Note that when k == m then there is no point in doing erasure coding -- it +degenerates to the equivalent of the Unix "split" utility which simply splits +the input into successive segments. Similarly, when k == 1 it degenerates to +the equivalent of the unix "cp" utility -- each block is a complete copy of the +input data. The "fec" command-line tool does not implement these degenerate +cases.) + +Note that each "primary block" is a segment of the original data, so its size +is 1/k'th of the size of original data, and each "secondary block" is of the +same size, so the total space used by all the blocks is m/k times the size of +the original data (plus some padding to fill out the last primary block to be +the same size as all the others). In addition to the data contained in the +blocks themselves there are also a few pieces of metadata which are necessary +for later reconstruction. Those pieces are: 1. the value of K, 2. the value +of M, 3. the sharenum of each block, 4. the number of bytes of padding +that were used. The "fec" command-line tool compresses these pieces of data +and prepends them to the beginning of each share, so each the sharefile +produced by the "fec" command-line tool is between one and four bytes larger +than the share data alone. + +The decoding step requires as input k of the blocks which were produced by the +encoding step. The decoding step produces as output the data that was earlier +input to the encoding step. + + + * Command-Line Tool + +The bin/ directory contains two Unix-style, command-line tools "fec" and +"unfec". Execute "fec --help" or "unfec --help" for usage instructions. + +Note: a Unix-style tool like "fec" does only one thing -- in this case +erasure coding -- and leaves other tasks to other tools. Other Unix-style +tools that go well with fec include "GNU tar" for archiving multiple files and +directories into one file, "rzip" for compression, "GNU Privacy Guard" for +encryption, and "sha256sum" for integrity. It is important to do things in +order: first archive, then compress, then either encrypt or sha256sum, then +erasure code. Note that if GNU Privacy Guard is used for privacy, then it will +also ensure integrity, so the use of sha256sum is unnecessary in that case. + + + * Performance Measurements + +On my Athlon 64 2.4 GHz workstation (running Linux), the "fec" command-line +tool encoded a 160 MB file with m=100, k=94 (about 6% redundancy) in 3.9 +seconds, where the "par2" tool encoded the file with about 6% redundancy in +27 seconds. "fec" encoded the same file with m=12, k=6 (100% redundancy) in +4.1 seconds, where par2 encoded it with about 100% redundancy in 7 minutes +and 56 seconds. + +The underlying C library in benchmark mode encoded from a file at about +4.9 million bytes per second and decoded at about 5.8 million bytes per second. + +On Peter's fancy Intel Mac laptop (2.16 GHz Core Duo), it encoded from a file +at about 6.2 million bytes per second. + +On my even fancier Intel Mac laptop (2.33 GHz Core Duo), it encoded from a file +at about 6.8 million bytes per second. + +On my old PowerPC G4 867 MHz Mac laptop, it encoded from a file at about 1.3 +million bytes per second. + + + * API + +Each block is associated with "blocknum". The blocknum of each primary block is +its index (starting from zero), so the 0'th block is the first primary block, +which is the first few bytes of the file, the 1'st block is the next primary +block, which is the next few bytes of the file, and so on. The last primary +block has blocknum k-1. The blocknum of each secondary block is an arbitrary +integer between k and 255 inclusive. (When using the Python API, if you don't +specify which blocknums you want for your secondary blocks when invoking +encode(), then it will by default provide the blocks with ids from k to m-1 +inclusive.) + + ** C API + +fec_encode() takes as input an array of k pointers, where each pointer points +to a memory buffer containing the input data (i.e., the i'th buffer contains +the i'th primary block). There is also a second parameter which is an array of +the blocknums of the secondary blocks which are to be produced. (Each element +in that array is required to be the blocknum of a secondary block, i.e. it is +required to be >= k and < m.) + +The output from fec_encode() is the requested set of secondary blocks which are +written into output buffers provided by the caller. + +fec_decode() takes as input an array of k pointers, where each pointer points +to a buffer containing a block. There is also a separate input parameter which +is an array of blocknums, indicating the blocknum of each of the blocks which is +being passed in. + +The output from fec_decode() is the set of primary blocks which were missing +from the input and had to be reconstructed. These reconstructed blocks are +written into putput buffers provided by the caller. + + ** Python API + +encode() and decode() take as input a sequence of k buffers, where a "sequence" +is any object that implements the Python sequence protocol (such as a list or +tuple) and a "buffer" is any object that implements the Python buffer protocol +(such as a string or array). The contents that are required to be present in +these buffers are the same as for the C API. + +encode() also takes a list of desired blocknums. Unlike the C API, the Python +API accepts blocknums of primary blocks as well as secondary blocks in its list +of desired blocknums. encode() returns a list of buffer objects which contain +the blocks requested. For each requested block which is a primary block, the +resulting list contains a reference to the apppropriate primary block from the +input list. For each requested block which is a secondary block, the list +contains a newly created string object containing that block. + +decode() also takes a list of integers indicating the blocknums of the blocks +being passed int. decode() returns a list of buffer objects which contain all +of the primary blocks of the original data (in order). For each primary block +which was present in the input list, then the result list simply contains a +reference to the object that was passed in the input list. For each primary +block which was not present in the input, the result list contains a newly +created string object containing that primary block. + +Beware of a "gotcha" that can result from the combination of mutable data and +the fact that the Python API returns references to inputs when possible. + +Returning references to its inputs is efficient since it avoids making an +unnecessary copy of the data, but if the object which was passed as input is +mutable and if that object is mutated after the call to zfec returns, then the +result from zfec -- which is just a reference to that same object -- will also +be mutated. This subtlety is the price you pay for avoiding data copying. If +you don't want to have to worry about this then you can simply use immutable +objects (e.g. Python strings) to hold the data that you pass to zfec. + + + * Utilities + +The filefec.py module has a utility function for efficiently reading a file +and encoding it piece by piece. This module is used by the "fec" and "unfec" +command-line tools from the bin/ directory. + + + * Dependencies + +A C compiler is required. To use the Python API or the command-line tools a +Python interpreter is also required. We have tested it with Python v2.4 and +v2.5. + + + * Acknowledgements + +Thanks to the author of the original fec lib, Luigi Rizzo, and the folks that +contributed to it: Phil Karn, Robert Morelos-Zaragoza, Hari Thirumoorthy, and +Dan Rubenstein. Thanks to the Mnet hackers who wrote an earlier Python +wrapper, especially Myers Carpenter and Hauke Johannknecht. Thanks to Brian +Warner and Amber O'Whielacronx for help with the API, documentation, +debugging, compression, and unit tests. Thanks to the creators of GCC +(starting with Richard M. Stallman) and Valgrind (starting with Julian Seward) +for a pair of excellent tools. Thanks to my coworkers at Allmydata -- +http://allmydata.com -- Fabrice Grinda, Peter Secor, Rob Kinninmont, Brian +Warner, Zandr Milewski, Justin Boreta, Mark Meras for sponsoring this work and +releasing it under a Free Software licence. + + +Enjoy! + +Zooko Wilcox-O'Hearn +2007-04-14 +Boulder, Colorado diff --git a/zfec/TODO b/zfec/TODO new file mode 100644 index 0000000..30f93ff --- /dev/null +++ b/zfec/TODO @@ -0,0 +1,5 @@ + * try Duff's device in _addmul1()? + * make sure it compiles with cygwin gcc -mno-cygwin + * memory usage analysis + * announce on pypi, sf.net, freshmeat, lwn, p2p-hackers + * try compiling with Microsoft compiler diff --git a/zfec/bin/fec b/zfec/bin/fec new file mode 100755 index 0000000..6ff1956 --- /dev/null +++ b/zfec/bin/fec @@ -0,0 +1,55 @@ +#!/usr/bin/env python + +# import bindann +# import bindann.monkeypatch.all + +import sys + +import fec +from fec.util import argparse +from fec import filefec +from fec.util.version import Version +__version__ = Version("1.0.0a1-0-STABLE") + +if '-V' in sys.argv or '--version' in sys.argv: + print "zfec library version: ", fec.__version__ + print "fec command-line tool version: ", __version__ + sys.exit(0) + +parser = argparse.ArgumentParser(description="Encode a file into a set of share files, a subset of which can later be used to recover the original file.") + +parser.add_argument('inputfile', help='file to encode or "-" for stdin', type=argparse.FileType('rb'), metavar='INF') +parser.add_argument('-d', '--output-dir', help='directory in which share file names will be created (default ".")', default='.', metavar='D') +parser.add_argument('-p', '--prefix', help='prefix for share file names; If omitted, the name of the input file will be used.', metavar='P') +parser.add_argument('-s', '--suffix', help='suffix for share file names (default ".fec")', default='.fec', metavar='S') +parser.add_argument('-m', '--totalshares', help='the total number of share files created (default 16)', default=16, type=int, metavar='M') +parser.add_argument('-k', '--requiredshares', help='the number of share files required to reconstruct (default 4)', default=4, type=int, metavar='K') +parser.add_argument('-f', '--force', help='overwrite any file which already in place an output file (share file)', action='store_true') +parser.add_argument('-v', '--verbose', help='print out messages about progress', action='store_true') +parser.add_argument('-V', '--version', help='print out version number and exit', action='store_true') +args = parser.parse_args() + +if args.prefix is None: + args.prefix = args.inputfile.name + if args.prefix == "": + args.prefix = "" + +if args.totalshares < 3: + print "Invalid parameters, totalshares is required to be >= 3\nPlease see the accompanying documentation." + sys.exit(1) +if args.totalshares > 256: + print "Invalid parameters, totalshares is required to be <= 256\nPlease see the accompanying documentation." + sys.exit(1) +if args.requiredshares < 2: + print "Invalid parameters, requiredshares is required to be >= 2\nPlease see the accompanying documentation." + sys.exit(1) +if args.requiredshares >= args.totalshares: + print "Invalid parameters, requiredshares is required to be < totalshares\nPlease see the accompanying documentation." + sys.exit(1) + +args.inputfile.seek(0, 2) +fsize = args.inputfile.tell() +args.inputfile.seek(0, 0) +ret = filefec.encode_to_files(args.inputfile, fsize, args.output_dir, args.prefix, args.requiredshares, args.totalshares, args.suffix, args.force, args.verbose) + +sys.exit(ret) diff --git a/zfec/bin/unfec b/zfec/bin/unfec new file mode 100755 index 0000000..0befb6b --- /dev/null +++ b/zfec/bin/unfec @@ -0,0 +1,49 @@ +#!/usr/bin/env python + +# import bindann +# import bindann.monkeypatch.all + +import os, sys + +from fec.util import argparse + +import fec +from fec import filefec +from fec.util.version import Version +__version__ = Version("1.0.0a1-0-STABLE") + +if '-V' in sys.argv or '--version' in sys.argv: + print "zfec library version: ", fec.__version__ + print "fec command-line tool version: ", __version__ + sys.exit(0) + +parser = argparse.ArgumentParser(description="Decode data from share files.") + +parser.add_argument('-o', '--outputfile', required=True, help='file to write the resulting data to, or "-" for stdout', type=str, metavar='OUTF') +parser.add_argument('sharefiles', nargs='*', help='shares file to read the encoded data from', type=argparse.FileType('rb'), metavar='SHAREFILE') +parser.add_argument('-v', '--verbose', help='print out messages about progress', action='store_true') +parser.add_argument('-f', '--force', help='overwrite any file which already in place of the output file', action='store_true') +parser.add_argument('-V', '--version', help='print out version number and exit', action='store_true') +args = parser.parse_args() + +if len(args.sharefiles) < 2: + print "At least two sharefiles are required." + sys.exit(1) + +if args.force: + outf = open(args.outputfile, 'wb') +else: + try: + outfd = os.open(args.outputfile, os.O_WRONLY|os.O_CREAT|os.O_EXCL) + except OSError: + print "There is already a file named %r -- aborting. Use --force to overwrite." % (args.outputfile,) + sys.exit(2) + outf = os.fdopen(outfd, "wb") + +try: + ret = filefec.decode_from_files(outf, args.sharefiles, args.verbose) +except filefec.InsufficientShareFilesError, e: + print str(e) + sys.exit(3) + +sys.exit(ret) diff --git a/zfec/fec/__init__.py b/zfec/fec/__init__.py new file mode 100644 index 0000000..42c2646 --- /dev/null +++ b/zfec/fec/__init__.py @@ -0,0 +1,21 @@ +""" +zfec -- fast forward error correction library with Python interface + +maintainer web site: U{http://zooko.com/} + +zfec web site: U{http://www.allmydata.com/source/zfec} +""" + +from util.version import Version + +# For an explanation of what the parts of the version string mean, +# please see pyutil.version. +__version__ = Version("1.0.0a1-2-STABLE") + +# Please put a URL or other note here which shows where to get the branch of +# development from which this version grew. +__sources__ = ["http://www.allmydata.com/source/zfec",] + +from _fec import Encoder, Decoder, Error +import filefec + diff --git a/zfec/fec/_fecmodule.c b/zfec/fec/_fecmodule.c new file mode 100644 index 0000000..c437e8b --- /dev/null +++ b/zfec/fec/_fecmodule.c @@ -0,0 +1,614 @@ +/** + * zfec -- fast forward error correction library with Python interface + * + * Copyright (C) 2007 Allmydata, Inc. + * Author: Zooko Wilcox-O'Hearn + * mailto:zooko@zooko.com + * + * This file is part of zfec. + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version 2 + * of the License, or (at your option) any later version. This program also + * comes with the added permission that, in the case that you are obligated to + * release a derived work under this licence (as per section 2.b of the GPL), + * you may delay the fulfillment of this obligation for up to 12 months. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA. + */ + +/** + * based on fecmodule.c by the Mnet Project, especially Myers Carpenter and + * Hauke Johannknecht + */ + +#include +#include + +#if (PY_VERSION_HEX < 0x02050000) +typedef int Py_ssize_t; +#endif + +#include "fec.h" + +#include "stdarg.h" + +static PyObject *py_fec_error; +static PyObject *py_raise_fec_error (const char *format, ...); + +static char fec__doc__[] = "\ +FEC - Forward Error Correction \n\ +"; + +static PyObject * +py_raise_fec_error(const char *format, ...) { + char exceptionMsg[1024]; + va_list ap; + + va_start (ap, format); + vsnprintf (exceptionMsg, 1024, format, ap); + va_end (ap); + exceptionMsg[1023]='\0'; + PyErr_SetString (py_fec_error, exceptionMsg); + return NULL; +} + +static char Encoder__doc__[] = "\ +Hold static encoder state (an in-memory table for matrix multiplication), and k and m parameters, and provide {encode()} method.\n\n\ +@param k: the number of packets required for reconstruction \n\ +@param m: the number of packets generated \n\ +"; + +typedef struct { + PyObject_HEAD + + /* expose these */ + short kk; + short mm; + + /* internal */ + fec_t* fec_matrix; +} Encoder; + +static PyObject * +Encoder_new(PyTypeObject *type, PyObject *args, PyObject *kwds) { + Encoder *self; + + self = (Encoder*)type->tp_alloc(type, 0); + if (self != NULL) { + self->kk = 0; + self->mm = 0; + self->fec_matrix = NULL; + } + + return (PyObject *)self; +} + +static int +Encoder_init(Encoder *self, PyObject *args, PyObject *kwdict) { + static char *kwlist[] = { + "k", + "m", + NULL + }; + int ink, inm; + if (!PyArg_ParseTupleAndKeywords(args, kwdict, "ii", kwlist, &ink, &inm)) + return -1; + + if (ink < 1) { + py_raise_fec_error("Precondition violation: first argument is required to be greater than or equal to 1, but it was %d", self->kk); + return -1; + } + if (inm < 1) { + py_raise_fec_error("Precondition violation: second argument is required to be greater than or equal to 1, but it was %d", self->mm); + return -1; + } + if (inm > 256) { + py_raise_fec_error("Precondition violation: second argument is required to be less than or equal to 256, but it was %d", self->mm); + return -1; + } + if (ink > inm) { + py_raise_fec_error("Precondition violation: first argument is required to be less than or equal to the second argument, but they were %d and %d respectively", ink, inm); + return -1; + } + self->kk = (short)ink; + self->mm = (short)inm; + self->fec_matrix = fec_new(self->kk, self->mm); + + return 0; +} + +static char Encoder_encode__doc__[] = "\ +Encode data into m packets.\n\ +\n\ +@param inblocks: a sequence of k buffers of data to encode -- these are the k primary blocks, i.e. the input data split into k pieces (for best performance, make it a tuple instead of a list); All blocks are required to be the same length.\n\ +@param desired_blocks_nums optional sequence of blocknums indicating which blocks to produce and return; If None, all m blocks will be returned (in order). (For best performance, make it a tuple instead of a list.)\n\ +@returns: a list of buffers containing the requested blocks; Note that if any of the input blocks were 'primary blocks', i.e. their blocknum was < k, then the result sequence will contain a Python reference to the same Python object as was passed in. As long as the Python object in question is immutable (i.e. a string) then you don't have to think about this detail, but if it is mutable (i.e. an array), then you have to be aware that if you subsequently mutate the contents of that object then that will also change the contents of the sequence that was returned from this call to encode().\n\ +"; + +static PyObject * +Encoder_encode(Encoder *self, PyObject *args) { + PyObject* inblocks; + PyObject* desired_blocks_nums = NULL; /* The blocknums of the blocks that should be returned. */ + PyObject* result = NULL; + + if (!PyArg_ParseTuple(args, "O|O", &inblocks, &desired_blocks_nums)) + return NULL; + + gf* check_blocks_produced[self->mm - self->kk]; /* This is an upper bound -- we will actually use only num_check_blocks_produced of these elements (see below). */ + PyObject* pystrs_produced[self->mm - self->kk]; /* This is an upper bound -- we will actually use only num_check_blocks_produced of these elements (see below). */ + unsigned num_check_blocks_produced = 0; /* The first num_check_blocks_produced elements of the check_blocks_produced array and of the pystrs_produced array will be used. */ + const gf* incblocks[self->kk]; + unsigned num_desired_blocks; + PyObject* fast_desired_blocks_nums = NULL; + PyObject** fast_desired_blocks_nums_items; + unsigned c_desired_blocks_nums[self->mm]; + unsigned c_desired_checkblocks_ids[self->mm - self->kk]; + unsigned i; + PyObject* fastinblocks = NULL; + + for (i=0; imm - self->kk; i++) + pystrs_produced[i] = NULL; + if (desired_blocks_nums) { + fast_desired_blocks_nums = PySequence_Fast(desired_blocks_nums, "Second argument (optional) was not a sequence."); + if (!fast_desired_blocks_nums) + goto err; + num_desired_blocks = PySequence_Fast_GET_SIZE(fast_desired_blocks_nums); + fast_desired_blocks_nums_items = PySequence_Fast_ITEMS(fast_desired_blocks_nums); + for (i=0; i= self->kk) + num_check_blocks_produced++; + } + } else { + num_desired_blocks = self->mm; + for (i=0; imm - self->kk; + } + + fastinblocks = PySequence_Fast(inblocks, "First argument was not a sequence."); + if (!fastinblocks) + goto err; + + if (PySequence_Fast_GET_SIZE(fastinblocks) != self->kk) { + py_raise_fec_error("Precondition violation: Wrong length -- first argument is required to contain exactly k blocks. len(first): %d, k: %d", PySequence_Fast_GET_SIZE(fastinblocks), self->kk); + goto err; + } + + /* Construct a C array of gf*'s of the input data. */ + PyObject** fastinblocksitems = PySequence_Fast_ITEMS(fastinblocks); + if (!fastinblocksitems) + goto err; + Py_ssize_t sz, oldsz = 0; + for (i=0; ikk; i++) { + if (!PyObject_CheckReadBuffer(fastinblocksitems[i])) { + py_raise_fec_error("Precondition violation: %u'th item is required to offer the single-segment read character buffer protocol, but it does not.\n", i); + goto err; + } + if (PyObject_AsReadBuffer(fastinblocksitems[i], (const void**)&(incblocks[i]), &sz)) + goto err; + if (oldsz != 0 && oldsz != sz) { + py_raise_fec_error("Precondition violation: Input blocks are required to be all the same length. oldsz: %Zu, sz: %Zu\n", oldsz, sz); + goto err; + } + oldsz = sz; + } + + /* Allocate space for all of the check blocks. */ + unsigned char check_block_index = 0; /* index into the check_blocks_produced and (parallel) pystrs_produced arrays */ + for (i=0; i= self->kk) { + c_desired_checkblocks_ids[check_block_index] = c_desired_blocks_nums[i]; + pystrs_produced[check_block_index] = PyString_FromStringAndSize(NULL, sz); + if (pystrs_produced[check_block_index] == NULL) + goto err; + check_blocks_produced[check_block_index] = (gf*)PyString_AsString(pystrs_produced[check_block_index]); + if (check_blocks_produced[check_block_index] == NULL) + goto err; + check_block_index++; + } + } + assert (check_block_index == num_check_blocks_produced); + + /* Encode any check blocks that are needed. */ + fec_encode(self->fec_matrix, incblocks, check_blocks_produced, c_desired_checkblocks_ids, num_check_blocks_produced, sz); + + /* Wrap all requested blocks up into a Python list of Python strings. */ + result = PyList_New(num_desired_blocks); + if (result == NULL) + goto err; + check_block_index = 0; + for (i=0; ikk) { + Py_INCREF(fastinblocksitems[c_desired_blocks_nums[i]]); + if (PyList_SetItem(result, i, fastinblocksitems[c_desired_blocks_nums[i]]) == -1) { + Py_DECREF(fastinblocksitems[c_desired_blocks_nums[i]]); + goto err; + } + } else { + if (PyList_SetItem(result, i, pystrs_produced[check_block_index]) == -1) + goto err; + pystrs_produced[check_block_index] = NULL; + check_block_index++; + } + } + + goto cleanup; + err: + for (i=0; ifec_matrix); + self->ob_type->tp_free((PyObject*)self); +} + +static PyMethodDef Encoder_methods[] = { + {"encode", (PyCFunction)Encoder_encode, METH_VARARGS, Encoder_encode__doc__}, + {NULL}, +}; + +static PyMemberDef Encoder_members[] = { + {"k", T_SHORT, offsetof(Encoder, kk), READONLY, "k"}, + {"m", T_SHORT, offsetof(Encoder, mm), READONLY, "m"}, + {NULL} /* Sentinel */ +}; + +static PyTypeObject Encoder_type = { + PyObject_HEAD_INIT(NULL) + 0, /*ob_size*/ + "fec.Encoder", /*tp_name*/ + sizeof(Encoder), /*tp_basicsize*/ + 0, /*tp_itemsize*/ + (destructor)Encoder_dealloc, /*tp_dealloc*/ + 0, /*tp_print*/ + 0, /*tp_getattr*/ + 0, /*tp_setattr*/ + 0, /*tp_compare*/ + 0, /*tp_repr*/ + 0, /*tp_as_number*/ + 0, /*tp_as_sequence*/ + 0, /*tp_as_mapping*/ + 0, /*tp_hash */ + 0, /*tp_call*/ + 0, /*tp_str*/ + 0, /*tp_getattro*/ + 0, /*tp_setattro*/ + 0, /*tp_as_buffer*/ + Py_TPFLAGS_DEFAULT | Py_TPFLAGS_BASETYPE, /*tp_flags*/ + Encoder__doc__, /* tp_doc */ + 0, /* tp_traverse */ + 0, /* tp_clear */ + 0, /* tp_richcompare */ + 0, /* tp_weaklistoffset */ + 0, /* tp_iter */ + 0, /* tp_iternext */ + Encoder_methods, /* tp_methods */ + Encoder_members, /* tp_members */ + 0, /* tp_getset */ + 0, /* tp_base */ + 0, /* tp_dict */ + 0, /* tp_descr_get */ + 0, /* tp_descr_set */ + 0, /* tp_dictoffset */ + (initproc)Encoder_init, /* tp_init */ + 0, /* tp_alloc */ + Encoder_new, /* tp_new */ +}; + +static char Decoder__doc__[] = "\ +Hold static decoder state (an in-memory table for matrix multiplication), and k and m parameters, and provide {decode()} method.\n\n\ +@param k: the number of packets required for reconstruction \n\ +@param m: the number of packets generated \n\ +"; + +typedef struct { + PyObject_HEAD + + /* expose these */ + short kk; + short mm; + + /* internal */ + fec_t* fec_matrix; +} Decoder; + +static PyObject * +Decoder_new(PyTypeObject *type, PyObject *args, PyObject *kwds) { + Decoder *self; + + self = (Decoder*)type->tp_alloc(type, 0); + if (self != NULL) { + self->kk = 0; + self->mm = 0; + self->fec_matrix = NULL; + } + + return (PyObject *)self; +} + +static int +Decoder_init(Encoder *self, PyObject *args, PyObject *kwdict) { + static char *kwlist[] = { + "k", + "m", + NULL + }; + + int ink, inm; + if (!PyArg_ParseTupleAndKeywords(args, kwdict, "ii", kwlist, &ink, &inm)) + return -1; + + if (ink < 1) { + py_raise_fec_error("Precondition violation: first argument is required to be greater than or equal to 1, but it was %d", self->kk); + return -1; + } + if (inm < 1) { + py_raise_fec_error("Precondition violation: second argument is required to be greater than or equal to 1, but it was %d", self->mm); + return -1; + } + if (inm > 256) { + py_raise_fec_error("Precondition violation: second argument is required to be less than or equal to 256, but it was %d", self->mm); + return -1; + } + if (ink > inm) { + py_raise_fec_error("Precondition violation: first argument is required to be less than or equal to the second argument, but they were %d and %d respectively", ink, inm); + return -1; + } + self->kk = (short)ink; + self->mm = (short)inm; + self->fec_matrix = fec_new(self->kk, self->mm); + + return 0; +} + +#define SWAP(a,b,t) {t tmp; tmp=a; a=b; b=tmp;} + +static char Decoder_decode__doc__[] = "\ +Decode a list blocks into a list of segments.\n\ +@param blocks a sequence of buffers containing block data (for best performance, make it a tuple instead of a list)\n\ +@param blocknums a sequence of integers of the blocknum for each block in blocks (for best performance, make it a tuple instead of a list)\n\ +\n\ +@return a list of strings containing the segment data (i.e. ''.join(retval) yields a string containing the decoded data)\n\ +"; + +static PyObject * +Decoder_decode(Decoder *self, PyObject *args) { + PyObject*restrict blocks; + PyObject*restrict blocknums; + PyObject* result = NULL; + + if (!PyArg_ParseTuple(args, "OO", &blocks, &blocknums)) + return NULL; + + const gf*restrict cblocks[self->kk]; + unsigned cblocknums[self->kk]; + gf*restrict recoveredcstrs[self->kk]; /* self->kk is actually an upper bound -- we probably won't need all of this space. */ + PyObject*restrict recoveredpystrs[self->kk]; /* self->kk is actually an upper bound -- we probably won't need all of this space. */ + unsigned i; + for (i=0; ikk; i++) + recoveredpystrs[i] = NULL; + PyObject*restrict fastblocknums = NULL; + PyObject*restrict fastblocks = PySequence_Fast(blocks, "First argument was not a sequence."); + if (!fastblocks) + goto err; + fastblocknums = PySequence_Fast(blocknums, "Second argument was not a sequence."); + if (!fastblocknums) + goto err; + + if (PySequence_Fast_GET_SIZE(fastblocks) != self->kk) { + py_raise_fec_error("Precondition violation: Wrong length -- first argument is required to contain exactly k blocks. len(first): %d, k: %d", PySequence_Fast_GET_SIZE(fastblocks), self->kk); + goto err; + } + if (PySequence_Fast_GET_SIZE(fastblocknums) != self->kk) { + py_raise_fec_error("Precondition violation: Wrong length -- blocknums is required to contain exactly k blocks. len(blocknums): %d, k: %d", PySequence_Fast_GET_SIZE(fastblocknums), self->kk); + goto err; + } + + /* Construct a C array of gf*'s of the data and another of C ints of the blocknums. */ + unsigned needtorecover=0; + PyObject** fastblocknumsitems = PySequence_Fast_ITEMS(fastblocknums); + if (!fastblocknumsitems) + goto err; + PyObject** fastblocksitems = PySequence_Fast_ITEMS(fastblocks); + if (!fastblocksitems) + goto err; + Py_ssize_t sz, oldsz = 0; + for (i=0; ikk; i++) { + if (!PyInt_Check(fastblocknumsitems[i])) { + py_raise_fec_error("Precondition violation: second argument is required to contain int."); + goto err; + } + long tmpl = PyInt_AsLong(fastblocknumsitems[i]); + if (tmpl < 0 || tmpl > 255) { + py_raise_fec_error("Precondition violation: block nums can't be less than zero or greater than 255. %ld\n", tmpl); + goto err; + } + cblocknums[i] = (unsigned)tmpl; + if (cblocknums[i] >= self->kk) + needtorecover+=1; + + if (!PyObject_CheckReadBuffer(fastblocksitems[i])) { + py_raise_fec_error("Precondition violation: %u'th item is required to offer the single-segment read character buffer protocol, but it does not.\n", i); + goto err; + } + if (PyObject_AsReadBuffer(fastblocksitems[i], (const void**)&(cblocks[i]), &sz)) + goto err; + if (oldsz != 0 && oldsz != sz) { + py_raise_fec_error("Precondition violation: Input blocks are required to be all the same length. oldsz: %Zu, sz: %Zu\n", oldsz, sz); + goto err; + } + oldsz = sz; + } + + /* move src packets into position */ + for (i=0; ikk;) { + if (cblocknums[i] >= self->kk || cblocknums[i] == i) + i++; + else { + /* put pkt in the right position. */ + unsigned c = cblocknums[i]; + + SWAP (cblocknums[i], cblocknums[c], int); + SWAP (cblocks[i], cblocks[c], const gf*); + SWAP (fastblocksitems[i], fastblocksitems[c], PyObject*); + } + } + + /* Allocate space for all of the recovered blocks. */ + for (i=0; ifec_matrix, cblocks, recoveredcstrs, cblocknums, sz); + + /* Wrap up both original primary blocks and decoded blocks into a Python list of Python strings. */ + unsigned nextrecoveredix=0; + result = PyList_New(self->kk); + if (result == NULL) + goto err; + for (i=0; ikk; i++) { + if (cblocknums[i] == i) { + /* Original primary block. */ + Py_INCREF(fastblocksitems[i]); + if (PyList_SetItem(result, i, fastblocksitems[i]) == -1) { + Py_DECREF(fastblocksitems[i]); + goto err; + } + } else { + /* Recovered block. */ + if (PyList_SetItem(result, i, recoveredpystrs[nextrecoveredix]) == -1) + goto err; + recoveredpystrs[nextrecoveredix] = NULL; + nextrecoveredix++; + } + } + + goto cleanup; + err: + for (i=0; ikk; i++) + Py_XDECREF(recoveredpystrs[i]); + Py_XDECREF(result); result = NULL; + cleanup: + Py_XDECREF(fastblocks); fastblocks=NULL; + Py_XDECREF(fastblocknums); fastblocknums=NULL; + return result; +} + +static void +Decoder_dealloc(Decoder * self) { + fec_free(self->fec_matrix); + self->ob_type->tp_free((PyObject*)self); +} + +static PyMethodDef Decoder_methods[] = { + {"decode", (PyCFunction)Decoder_decode, METH_VARARGS, Decoder_decode__doc__}, + {NULL}, +}; + +static PyMemberDef Decoder_members[] = { + {"k", T_SHORT, offsetof(Encoder, kk), READONLY, "k"}, + {"m", T_SHORT, offsetof(Encoder, mm), READONLY, "m"}, + {NULL} /* Sentinel */ +}; + +static PyTypeObject Decoder_type = { + PyObject_HEAD_INIT(NULL) + 0, /*ob_size*/ + "fec.Decoder", /*tp_name*/ + sizeof(Decoder), /*tp_basicsize*/ + 0, /*tp_itemsize*/ + (destructor)Decoder_dealloc, /*tp_dealloc*/ + 0, /*tp_print*/ + 0, /*tp_getattr*/ + 0, /*tp_setattr*/ + 0, /*tp_compare*/ + 0, /*tp_repr*/ + 0, /*tp_as_number*/ + 0, /*tp_as_sequence*/ + 0, /*tp_as_mapping*/ + 0, /*tp_hash */ + 0, /*tp_call*/ + 0, /*tp_str*/ + 0, /*tp_getattro*/ + 0, /*tp_setattro*/ + 0, /*tp_as_buffer*/ + Py_TPFLAGS_DEFAULT | Py_TPFLAGS_BASETYPE, /*tp_flags*/ + Decoder__doc__, /* tp_doc */ + 0, /* tp_traverse */ + 0, /* tp_clear */ + 0, /* tp_richcompare */ + 0, /* tp_weaklistoffset */ + 0, /* tp_iter */ + 0, /* tp_iternext */ + Decoder_methods, /* tp_methods */ + Decoder_members, /* tp_members */ + 0, /* tp_getset */ + 0, /* tp_base */ + 0, /* tp_dict */ + 0, /* tp_descr_get */ + 0, /* tp_descr_set */ + 0, /* tp_dictoffset */ + (initproc)Decoder_init, /* tp_init */ + 0, /* tp_alloc */ + Decoder_new, /* tp_new */ +}; + +static PyMethodDef fec_methods[] = { + {NULL} +}; + +#ifndef PyMODINIT_FUNC /* declarations for DLL import/export */ +#define PyMODINIT_FUNC void +#endif +PyMODINIT_FUNC +init_fec(void) { + PyObject *module; + PyObject *module_dict; + + if (PyType_Ready(&Encoder_type) < 0) + return; + if (PyType_Ready(&Decoder_type) < 0) + return; + + module = Py_InitModule3("_fec", fec_methods, fec__doc__); + if (module == NULL) + return; + + Py_INCREF(&Encoder_type); + Py_INCREF(&Decoder_type); + + PyModule_AddObject(module, "Encoder", (PyObject *)&Encoder_type); + PyModule_AddObject(module, "Decoder", (PyObject *)&Decoder_type); + + module_dict = PyModule_GetDict(module); + py_fec_error = PyErr_NewException("_fec.Error", NULL, NULL); + PyDict_SetItemString(module_dict, "Error", py_fec_error); +} + diff --git a/zfec/fec/easyfec.py b/zfec/fec/easyfec.py new file mode 100644 index 0000000..8fc1506 --- /dev/null +++ b/zfec/fec/easyfec.py @@ -0,0 +1,37 @@ +import fec + +# div_ceil() was copied from the pyutil library. +def div_ceil(n, d): + """ + The smallest integer k such that k*d >= n. + """ + return (n/d) + (n%d != 0) + +class Encoder(object): + def __init__(self, k, m): + self.fec = fec.Encoder(k, m) + + def encode(self, data): + """ + @param data: string + """ + chunksize = div_ceil(len(data), self.fec.k) + numchunks = div_ceil(len(data), chunksize) + l = [ data[i:i+chunksize] for i in range(0, len(data), chunksize) ] + # padding + if len(l[-1]) != len(l[0]): + l[-1] = l[-1] + ('\x00'*(len(l[0])-len(l[-1]))) + res = self.fec.encode(l) + return res + +class Decoder(object): + def __init__(self, k, m): + self.fec = fec.Decoder(k, m) + + def decode(self, blocks, sharenums, padlen=0): + blocks = self.fec.decode(blocks, sharenums) + data = ''.join(blocks) + if padlen: + data = data[:-padlen] + return data + diff --git a/zfec/fec/fec.c b/zfec/fec/fec.c new file mode 100644 index 0000000..bf04429 --- /dev/null +++ b/zfec/fec/fec.c @@ -0,0 +1,616 @@ +/** + * zfec -- fast forward error correction library with Python interface + * + * Copyright (C) 2007 Allmydata, Inc. + * Author: Zooko Wilcox-O'Hearn + * mailto:zooko@zooko.com + * + * This file is part of zfec. + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License as published by the Free + * Software Foundation; either version 2 of the License, or (at your option) + * any later version. This program also comes with the added permission that, + * in the case that you are obligated to release a derived work under this + * licence (as per section 2.b of the GPL), you may delay the fulfillment of + * this obligation for up to 12 months. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA. + */ + +/* + * Much of this work is derived from the "fec" software by Luigi Rizzo, et + * al., the copyright notice and licence terms of which are included below + * for reference. + * fec.c -- forward error correction based on Vandermonde matrices + * 980624 + * (C) 1997-98 Luigi Rizzo (luigi@iet.unipi.it) + * + * Portions derived from code by Phil Karn (karn@ka9q.ampr.org), + * Robert Morelos-Zaragoza (robert@spectra.eng.hawaii.edu) and Hari + * Thirumoorthy (harit@spectra.eng.hawaii.edu), Aug 1995 + * + * Modifications by Dan Rubenstein (see Modifications.txt for + * their description. + * Modifications (C) 1998 Dan Rubenstein (drubenst@cs.umass.edu) + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials + * provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, + * THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A + * PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHORS + * BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, + * OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, + * PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, + * OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR + * TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT + * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY + * OF SUCH DAMAGE. + */ + +#include +#include +#include +#include + +#include "fec.h" + + +/* + * If you get a error returned (negative value) from a fec_* function, + * look in here for the error message. + */ + +#define FEC_ERROR_SIZE 1025 +char fec_error[FEC_ERROR_SIZE+1]; + +#define ERR(...) (snprintf(fec_error, FEC_ERROR_SIZE, __VA_ARGS__)) + +/* + * Primitive polynomials - see Lin & Costello, Appendix A, + * and Lee & Messerschmitt, p. 453. + */ +static const char*const Pp="101110001"; + + +/* + * To speed up computations, we have tables for logarithm, exponent and + * inverse of a number. We use a table for multiplication as well (it takes + * 64K, no big deal even on a PDA, especially because it can be + * pre-initialized an put into a ROM!), otherwhise we use a table of + * logarithms. In any case the macro gf_mul(x,y) takes care of + * multiplications. + */ + +static gf gf_exp[510]; /* index->poly form conversion table */ +static int gf_log[256]; /* Poly->index form conversion table */ +static gf inverse[256]; /* inverse of field elem. */ + /* inv[\alpha**i]=\alpha**(GF_SIZE-i-1) */ + +/* + * modnn(x) computes x % GF_SIZE, where GF_SIZE is 2**GF_BITS - 1, + * without a slow divide. + */ +static inline gf +modnn(int x) { + while (x >= 255) { + x -= 255; + x = (x >> 8) + (x & 255); + } + return x; +} + +#define SWAP(a,b,t) {t tmp; tmp=a; a=b; b=tmp;} + +/* + * gf_mul(x,y) multiplies two numbers. It is much faster to use a + * multiplication table. + * + * USE_GF_MULC, GF_MULC0(c) and GF_ADDMULC(x) can be used when multiplying + * many numbers by the same constant. In this case the first call sets the + * constant, and others perform the multiplications. A value related to the + * multiplication is held in a local variable declared with USE_GF_MULC . See + * usage in _addmul1(). + */ +static gf gf_mul_table[256][256]; + +#define gf_mul(x,y) gf_mul_table[x][y] + +#define USE_GF_MULC register gf * __gf_mulc_ +#define GF_MULC0(c) __gf_mulc_ = gf_mul_table[c] +#define GF_ADDMULC(dst, x) dst ^= __gf_mulc_[x] + +/* + * Generate GF(2**m) from the irreducible polynomial p(X) in p[0]..p[m] + * Lookup tables: + * index->polynomial form gf_exp[] contains j= \alpha^i; + * polynomial form -> index form gf_log[ j = \alpha^i ] = i + * \alpha=x is the primitive element of GF(2^m) + * + * For efficiency, gf_exp[] has size 2*GF_SIZE, so that a simple + * multiplication of two numbers can be resolved without calling modnn + */ +static void +_init_mul_table(void) { + int i, j; + for (i = 0; i < 256; i++) + for (j = 0; j < 256; j++) + gf_mul_table[i][j] = gf_exp[modnn (gf_log[i] + gf_log[j])]; + + for (j = 0; j < 256; j++) + gf_mul_table[0][j] = gf_mul_table[j][0] = 0; +} + +/* + * i use malloc so many times, it is easier to put checks all in + * one place. + */ +static void * +my_malloc (int sz, char *err_string) { + void *p = malloc (sz); + if (p == NULL) { + ERR("Malloc failure allocating %s\n", err_string); + exit (1); + } + return p; +} + +#define NEW_GF_MATRIX(rows, cols) \ + (gf*)my_malloc(rows * cols, " ## __LINE__ ## " ) + +/* + * initialize the data structures used for computations in GF. + */ +static void +generate_gf (void) { + int i; + gf mask; + + mask = 1; /* x ** 0 = 1 */ + gf_exp[8] = 0; /* will be updated at the end of the 1st loop */ + /* + * first, generate the (polynomial representation of) powers of \alpha, + * which are stored in gf_exp[i] = \alpha ** i . + * At the same time build gf_log[gf_exp[i]] = i . + * The first 8 powers are simply bits shifted to the left. + */ + for (i = 0; i < 8; i++, mask <<= 1) { + gf_exp[i] = mask; + gf_log[gf_exp[i]] = i; + /* + * If Pp[i] == 1 then \alpha ** i occurs in poly-repr + * gf_exp[8] = \alpha ** 8 + */ + if (Pp[i] == '1') + gf_exp[8] ^= mask; + } + /* + * now gf_exp[8] = \alpha ** 8 is complete, so can also + * compute its inverse. + */ + gf_log[gf_exp[8]] = 8; + /* + * Poly-repr of \alpha ** (i+1) is given by poly-repr of + * \alpha ** i shifted left one-bit and accounting for any + * \alpha ** 8 term that may occur when poly-repr of + * \alpha ** i is shifted. + */ + mask = 1 << 7; + for (i = 9; i < 255; i++) { + if (gf_exp[i - 1] >= mask) + gf_exp[i] = gf_exp[8] ^ ((gf_exp[i - 1] ^ mask) << 1); + else + gf_exp[i] = gf_exp[i - 1] << 1; + gf_log[gf_exp[i]] = i; + } + /* + * log(0) is not defined, so use a special value + */ + gf_log[0] = 255; + /* set the extended gf_exp values for fast multiply */ + for (i = 0; i < 255; i++) + gf_exp[i + 255] = gf_exp[i]; + + /* + * again special cases. 0 has no inverse. This used to + * be initialized to 255, but it should make no difference + * since noone is supposed to read from here. + */ + inverse[0] = 0; + inverse[1] = 1; + for (i = 2; i <= 255; i++) + inverse[i] = gf_exp[255 - gf_log[i]]; +} + +/* + * Various linear algebra operations that i use often. + */ + +/* + * addmul() computes dst[] = dst[] + c * src[] + * This is used often, so better optimize it! Currently the loop is + * unrolled 16 times, a good value for 486 and pentium-class machines. + * The case c=0 is also optimized, whereas c=1 is not. These + * calls are unfrequent in my typical apps so I did not bother. + */ +#define addmul(dst, src, c, sz) \ + if (c != 0) _addmul1(dst, src, c, sz) + +#define UNROLL 16 /* 1, 4, 8, 16 */ +static void +_addmul1(register gf*restrict dst, const register gf*restrict src, gf c, size_t sz) { + USE_GF_MULC; + const gf* lim = &dst[sz - UNROLL + 1]; + + GF_MULC0 (c); + +#if (UNROLL > 1) /* unrolling by 8/16 is quite effective on the pentium */ + for (; dst < lim; dst += UNROLL, src += UNROLL) { + GF_ADDMULC (dst[0], src[0]); + GF_ADDMULC (dst[1], src[1]); + GF_ADDMULC (dst[2], src[2]); + GF_ADDMULC (dst[3], src[3]); +#if (UNROLL > 4) + GF_ADDMULC (dst[4], src[4]); + GF_ADDMULC (dst[5], src[5]); + GF_ADDMULC (dst[6], src[6]); + GF_ADDMULC (dst[7], src[7]); +#endif +#if (UNROLL > 8) + GF_ADDMULC (dst[8], src[8]); + GF_ADDMULC (dst[9], src[9]); + GF_ADDMULC (dst[10], src[10]); + GF_ADDMULC (dst[11], src[11]); + GF_ADDMULC (dst[12], src[12]); + GF_ADDMULC (dst[13], src[13]); + GF_ADDMULC (dst[14], src[14]); + GF_ADDMULC (dst[15], src[15]); +#endif + } +#endif + lim += UNROLL - 1; + for (; dst < lim; dst++, src++) /* final components */ + GF_ADDMULC (*dst, *src); +} + +/* + * computes C = AB where A is n*k, B is k*m, C is n*m + */ +static void +_matmul(gf * a, gf * b, gf * c, unsigned n, unsigned k, unsigned m) { + unsigned row, col, i; + + for (row = 0; row < n; row++) { + for (col = 0; col < m; col++) { + gf *pa = &a[row * k]; + gf *pb = &b[col]; + gf acc = 0; + for (i = 0; i < k; i++, pa++, pb += m) + acc ^= gf_mul (*pa, *pb); + c[row * m + col] = acc; + } + } +} + +/* + * _invert_mat() takes a matrix and produces its inverse + * k is the size of the matrix. + * (Gauss-Jordan, adapted from Numerical Recipes in C) + * Return non-zero if singular. + */ +static void +_invert_mat(gf* src, unsigned k) { + gf c, *p; + unsigned irow = 0; + unsigned icol = 0; + unsigned row, col, i, ix; + + unsigned* indxc = (unsigned*) my_malloc (k * sizeof(unsigned), "indxc"); + unsigned* indxr = (unsigned*) my_malloc (k * sizeof(unsigned), "indxr"); + unsigned* ipiv = (unsigned*) my_malloc (k * sizeof(unsigned), "ipiv"); + gf *id_row = NEW_GF_MATRIX (1, k); + gf *temp_row = NEW_GF_MATRIX (1, k); + + memset (id_row, '\0', k * sizeof (gf)); + /* + * ipiv marks elements already used as pivots. + */ + for (i = 0; i < k; i++) + ipiv[i] = 0; + + for (col = 0; col < k; col++) { + gf *pivot_row; + /* + * Zeroing column 'col', look for a non-zero element. + * First try on the diagonal, if it fails, look elsewhere. + */ + if (ipiv[col] != 1 && src[col * k + col] != 0) { + irow = col; + icol = col; + goto found_piv; + } + for (row = 0; row < k; row++) { + if (ipiv[row] != 1) { + for (ix = 0; ix < k; ix++) { + if (ipiv[ix] == 0) { + if (src[row * k + ix] != 0) { + irow = row; + icol = ix; + goto found_piv; + } + } else if (ipiv[ix] > 1) { + ERR("singular matrix"); + goto fail; + } + } + } + } + found_piv: + ++(ipiv[icol]); + /* + * swap rows irow and icol, so afterwards the diagonal + * element will be correct. Rarely done, not worth + * optimizing. + */ + if (irow != icol) + for (ix = 0; ix < k; ix++) + SWAP (src[irow * k + ix], src[icol * k + ix], gf); + indxr[col] = irow; + indxc[col] = icol; + pivot_row = &src[icol * k]; + c = pivot_row[icol]; + if (c == 0) { + ERR("singular matrix 2"); + goto fail; + } + if (c != 1) { /* otherwhise this is a NOP */ + /* + * this is done often , but optimizing is not so + * fruitful, at least in the obvious ways (unrolling) + */ + c = inverse[c]; + pivot_row[icol] = 1; + for (ix = 0; ix < k; ix++) + pivot_row[ix] = gf_mul (c, pivot_row[ix]); + } + /* + * from all rows, remove multiples of the selected row + * to zero the relevant entry (in fact, the entry is not zero + * because we know it must be zero). + * (Here, if we know that the pivot_row is the identity, + * we can optimize the addmul). + */ + id_row[icol] = 1; + if (memcmp (pivot_row, id_row, k * sizeof (gf)) != 0) { + for (p = src, ix = 0; ix < k; ix++, p += k) { + if (ix != icol) { + c = p[icol]; + p[icol] = 0; + addmul (p, pivot_row, c, k); + } + } + } + id_row[icol] = 0; + } /* done all columns */ + for (col = k; col > 0; col--) + if (indxr[col-1] != indxc[col-1]) + for (row = 0; row < k; row++) + SWAP (src[row * k + indxr[col-1]], src[row * k + indxc[col-1]], gf); + fail: + free (indxc); + free (indxr); + free (ipiv); + free (id_row); + free (temp_row); + return; +} + +/* + * fast code for inverting a vandermonde matrix. + * + * NOTE: It assumes that the matrix is not singular and _IS_ a vandermonde + * matrix. Only uses the second column of the matrix, containing the p_i's. + * + * Algorithm borrowed from "Numerical recipes in C" -- sec.2.8, but largely + * revised for my purposes. + * p = coefficients of the matrix (p_i) + * q = values of the polynomial (known) + */ +void +_invert_vdm (gf* src, unsigned k) { + unsigned i, j, row, col; + gf *b, *c, *p; + gf t, xx; + + if (k == 1) /* degenerate case, matrix must be p^0 = 1 */ + return; + /* + * c holds the coefficient of P(x) = Prod (x - p_i), i=0..k-1 + * b holds the coefficient for the matrix inversion + */ + c = NEW_GF_MATRIX (1, k); + b = NEW_GF_MATRIX (1, k); + + p = NEW_GF_MATRIX (1, k); + + for (j = 1, i = 0; i < k; i++, j += k) { + c[i] = 0; + p[i] = src[j]; /* p[i] */ + } + /* + * construct coeffs. recursively. We know c[k] = 1 (implicit) + * and start P_0 = x - p_0, then at each stage multiply by + * x - p_i generating P_i = x P_{i-1} - p_i P_{i-1} + * After k steps we are done. + */ + c[k - 1] = p[0]; /* really -p(0), but x = -x in GF(2^m) */ + for (i = 1; i < k; i++) { + gf p_i = p[i]; /* see above comment */ + for (j = k - 1 - (i - 1); j < k - 1; j++) + c[j] ^= gf_mul (p_i, c[j + 1]); + c[k - 1] ^= p_i; + } + + for (row = 0; row < k; row++) { + /* + * synthetic division etc. + */ + xx = p[row]; + t = 1; + b[k - 1] = 1; /* this is in fact c[k] */ + for (i = k - 1; i > 0; i--) { + b[i-1] = c[i] ^ gf_mul (xx, b[i]); + t = gf_mul (xx, t) ^ b[i-1]; + } + for (col = 0; col < k; col++) + src[col * k + row] = gf_mul (inverse[t], b[col]); + } + free (c); + free (b); + free (p); + return; +} + +static int fec_initialized = 0; +static void +init_fec (void) { + generate_gf(); + _init_mul_table(); + fec_initialized = 1; +} + +/* + * This section contains the proper FEC encoding/decoding routines. + * The encoding matrix is computed starting with a Vandermonde matrix, + * and then transforming it into a systematic matrix. + */ + +#define FEC_MAGIC 0xFECC0DEC + +void +fec_free (fec_t *p) { + if (p == NULL || + p->magic != (((FEC_MAGIC ^ p->k) ^ p->n) ^ (unsigned long) (p->enc_matrix))) { + ERR("bad parameters to fec_free"); + return; + } + free (p->enc_matrix); + free (p); +} + +fec_t * +fec_new(unsigned k, unsigned n) { + unsigned row, col; + gf *p, *tmp_m; + + fec_t *retval; + + fec_error[FEC_ERROR_SIZE] = '\0'; + + if (fec_initialized == 0) + init_fec (); + + retval = (fec_t *) my_malloc (sizeof (fec_t), "new_code"); + retval->k = k; + retval->n = n; + retval->enc_matrix = NEW_GF_MATRIX (n, k); + retval->magic = ((FEC_MAGIC ^ k) ^ n) ^ (unsigned long) (retval->enc_matrix); + tmp_m = NEW_GF_MATRIX (n, k); + /* + * fill the matrix with powers of field elements, starting from 0. + * The first row is special, cannot be computed with exp. table. + */ + tmp_m[0] = 1; + for (col = 1; col < k; col++) + tmp_m[col] = 0; + for (p = tmp_m + k, row = 0; row < n - 1; row++, p += k) + for (col = 0; col < k; col++) + p[col] = gf_exp[modnn (row * col)]; + + /* + * quick code to build systematic matrix: invert the top + * k*k vandermonde matrix, multiply right the bottom n-k rows + * by the inverse, and construct the identity matrix at the top. + */ + _invert_vdm (tmp_m, k); /* much faster than _invert_mat */ + _matmul(tmp_m + k * k, tmp_m, retval->enc_matrix + k * k, n - k, k, k); + /* + * the upper matrix is I so do not bother with a slow multiply + */ + memset (retval->enc_matrix, '\0', k * k * sizeof (gf)); + for (p = retval->enc_matrix, col = 0; col < k; col++, p += k + 1) + *p = 1; + free (tmp_m); + + return retval; +} + +void +fec_encode(const fec_t* code, const gf*restrict const*restrict const src, gf*restrict const*restrict const fecs, const unsigned*restrict const block_nums, size_t num_block_nums, size_t sz) { + unsigned char i, j; + unsigned fecnum; + gf* p; + + for (i=0; i= code->k); + memset(fecs[i], 0, sz); + p = &(code->enc_matrix[fecnum * code->k]); + for (j = 0; j < code->k; j++) + addmul(fecs[i], src[j], p[j], sz); + } +} + +/** + * Build decode matrix into some memory space. + * + * @param matrix a space allocated for a k by k matrix + */ +void +build_decode_matrix_into_space(const fec_t*restrict const code, const unsigned*const restrict index, const unsigned k, gf*restrict const matrix) { + unsigned char i; + gf* p; + for (i=0, p=matrix; i < k; i++, p += k) { + if (index[i] < k) { + memset(p, 0, k); + p[i] = 1; + } else { + memcpy(p, &(code->enc_matrix[index[i] * code->k]), k); + } + } + _invert_mat (matrix, k); +} + +void +fec_decode(const fec_t* code, const gf*restrict const*restrict const inpkts, gf*restrict const*restrict const outpkts, const unsigned*restrict const index, size_t sz) { + gf m_dec[code->k * code->k]; + build_decode_matrix_into_space(code, index, code->k, m_dec); + + unsigned char outix=0; + for (unsigned char row=0; rowk; row++) { + if (index[row] >= code->k) { + memset(outpkts[outix], 0, sz); + for (unsigned char col=0; col < code->k; col++) + addmul(outpkts[outix], inpkts[col], m_dec[row * code->k + col], sz); + outix++; + } + } +} diff --git a/zfec/fec/fec.h b/zfec/fec/fec.h new file mode 100644 index 0000000..360aed3 --- /dev/null +++ b/zfec/fec/fec.h @@ -0,0 +1,101 @@ +/** + * zfec -- fast forward error correction library with Python interface + * + * Copyright (C) 2007 Allmydata, Inc. + * Author: Zooko Wilcox-O'Hearn + * mailto:zooko@zooko.com + * + * This file is part of zfec. + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License as published by the Free + * Software Foundation; either version 2 of the License, or (at your option) + * any later version. This program also comes with the added permission that, + * in the case that you are obligated to release a derived work under this + * licence (as per section 2.b of the GPL), you may delay the fulfillment of + * this obligation for up to 12 months. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA. + */ + +/* + * Much of this work is derived from the "fec" software by Luigi Rizzo, et + * al., the copyright notice and licence terms of which are included below + * for reference. + * + * fec.h -- forward error correction based on Vandermonde matrices + * 980614 + * (C) 1997-98 Luigi Rizzo (luigi@iet.unipi.it) + * + * Portions derived from code by Phil Karn (karn@ka9q.ampr.org), + * Robert Morelos-Zaragoza (robert@spectra.eng.hawaii.edu) and Hari + * Thirumoorthy (harit@spectra.eng.hawaii.edu), Aug 1995 + * + * Modifications by Dan Rubenstein (see Modifications.txt for + * their description. + * Modifications (C) 1998 Dan Rubenstein (drubenst@cs.umass.edu) + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials + * provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, + * THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A + * PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHORS + * BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, + * OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, + * PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, + * OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR + * TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT + * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY + * OF SUCH DAMAGE. + */ + +typedef unsigned char gf; + +typedef struct { + unsigned long magic; + unsigned k, n; /* parameters of the code */ + gf* enc_matrix; +} fec_t; + +/** + * param k the number of blocks required to reconstruct + * param m the total number of blocks created + */ +fec_t* fec_new(unsigned k, unsigned m); +void fec_free(fec_t* p); + +/** + * @param inpkts the "primary blocks" i.e. the chunks of the input data + * @param fecs buffers into which the secondary blocks will be written + * @param block_nums the numbers of the desired blocks -- including both primary blocks (the id < k) which fec_encode() ignores and check blocks (the id >= k) which fec_encode() will produce and store into the buffers of the fecs parameter + * @param num_block_nums the length of the block_nums array + */ +void fec_encode(const fec_t* code, const gf*restrict const*restrict const src, gf*restrict const*restrict const fecs, const unsigned*restrict const block_nums, size_t num_block_nums, size_t sz); + +/** + * @param inpkts an array of packets (size k) + * @param outpkts an array of buffers into which the reconstructed output packets will be written (only packets which are not present in the inpkts input will be reconstructed and written to outpkts) + * @param index an array of the blocknums of the packets in inpkts + * @param sz size of a packet in bytes + */ +void fec_decode(const fec_t* code, const gf*restrict const*restrict const inpkts, gf*restrict const*restrict const outpkts, const unsigned*restrict const index, size_t sz); + +/* end of file */ diff --git a/zfec/fec/filefec.py b/zfec/fec/filefec.py new file mode 100644 index 0000000..7a08a19 --- /dev/null +++ b/zfec/fec/filefec.py @@ -0,0 +1,436 @@ +# zfec -- fast forward error correction library with Python interface +# +# Copyright (C) 2007 Allmydata, Inc. +# Author: Zooko Wilcox-O'Hearn +# mailto:zooko@zooko.com +# +# This file is part of zfec. +# +# This program is free software; you can redistribute it and/or modify it +# under the terms of the GNU General Public License as published by the Free +# Software Foundation; either version 2 of the License, or (at your option) +# any later version. This program also comes with the added permission that, +# in the case that you are obligated to release a derived work under this +# licence (as per section 2.b of the GPL), you may delay the fulfillment of +# this obligation for up to 12 months. +# +# This program is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with this program; if not, write to the Free Software +# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA. + +import easyfec, fec +from util import fileutil +from util.mathutil import log_ceil + +import array, os, re, struct, traceback + +CHUNKSIZE = 4096 + +class InsufficientShareFilesError(fec.Error): + def __init__(self, k, kb, *args, **kwargs): + fec.Error.__init__(self, *args, **kwargs) + self.k = k + self.kb = kb + + def __repr__(self): + return "Insufficient share files -- %d share files are required to recover this file, but only %d were given" % (self.k, self.kb,) + + def __str__(self): + return self.__repr__() + +class CorruptedShareFilesError(fec.Error): + pass + +def _build_header(m, k, pad, sh): + """ + @param m: the total number of shares; 3 <= m <= 256 + @param k: the number of shares required to reconstruct; 2 <= k < m + @param pad: the number of bytes of padding added to the file before encoding; 0 <= pad < k + @param sh: the shnum of this share; 0 <= k < m + + @return: a string (which is hopefully short) encoding m, k, sh, and pad + """ + assert m >= 3 + assert m <= 2**8 + assert k >= 2 + assert k < m + assert pad >= 0 + assert pad < k + + assert sh >= 0 + assert sh < m + + bitsused = 0 + val = 0 + + val |= (m - 3) + bitsused += 8 # the first 8 bits always encode m + + kbits = log_ceil(m-2, 2) # num bits needed to store all possible values of k + val <<= kbits + bitsused += kbits + + val |= (k - 2) + + padbits = log_ceil(k, 2) # num bits needed to store all possible values of pad + val <<= padbits + bitsused += padbits + + val |= pad + + shnumbits = log_ceil(m, 2) # num bits needed to store all possible values of shnum + val <<= shnumbits + bitsused += shnumbits + + val |= sh + + assert bitsused >= 11 + assert bitsused <= 32 + + if bitsused <= 16: + val <<= (16-bitsused) + cs = struct.pack('>H', val) + assert cs[:-2] == '\x00' * (len(cs)-2) + return cs[-2:] + if bitsused <= 24: + val <<= (24-bitsused) + cs = struct.pack('>I', val) + assert cs[:-3] == '\x00' * (len(cs)-3) + return cs[-3:] + else: + val <<= (32-bitsused) + cs = struct.pack('>I', val) + assert cs[:-4] == '\x00' * (len(cs)-4) + return cs[-4:] + +def MASK(bits): + return (1<> b2_bits_left) + 2 + + shbits = log_ceil(m, 2) # num bits needed to store all possible values of shnum + padbits = log_ceil(k, 2) # num bits needed to store all possible values of pad + + val = byte & (~kbitmask) + + needed_padbits = padbits - b2_bits_left + if needed_padbits > 0: + ch = inf.read(1) + if not ch: + raise CorruptedShareFilesError("Share files were corrupted -- share file %r didn't have a complete metadata header at the front. Perhaps the file was truncated." % (inf.name,)) + byte = struct.unpack(">B", ch)[0] + val <<= 8 + val |= byte + needed_padbits -= 8 + assert needed_padbits <= 0 + extrabits = -needed_padbits + pad = val >> extrabits + val &= MASK(extrabits) + + needed_shbits = shbits - extrabits + if needed_shbits > 0: + ch = inf.read(1) + if not ch: + raise CorruptedShareFilesError("Share files were corrupted -- share file %r didn't have a complete metadata header at the front. Perhaps the file was truncated." % (inf.name,)) + byte = struct.unpack(">B", ch)[0] + val <<= 8 + val |= byte + needed_shbits -= 8 + assert needed_shbits <= 0 + + gotshbits = -needed_shbits + + sh = val >> gotshbits + + return (m, k, pad, sh,) + +FORMAT_FORMAT = "%%s.%%0%dd_%%0%dd%%s" +RE_FORMAT = "%s.[0-9]+_[0-9]+%s" +def encode_to_files(inf, fsize, dirname, prefix, k, m, suffix=".fec", overwrite=False, verbose=False): + """ + Encode inf, writing the shares to specially named, newly created files. + + @param fsize: calling read() on inf must yield fsize bytes of data and + then raise an EOFError + @param dirname: the name of the directory into which the sharefiles will + be written + """ + mlen = len(str(m)) + format = FORMAT_FORMAT % (mlen, mlen,) + + padbytes = fec.util.mathutil.pad_size(fsize, k) + + fns = [] + fs = [] + try: + for shnum in range(m): + hdr = _build_header(m, k, padbytes, shnum) + + fn = os.path.join(dirname, format % (prefix, shnum, m, suffix,)) + if verbose: + print "Creating share file %r..." % (fn,) + if overwrite: + f = open(fn, "wb") + else: + fd = os.open(fn, os.O_WRONLY|os.O_CREAT|os.O_EXCL) + f = os.fdopen(fd, "wb") + f.write(hdr) + fs.append(f) + fns.append(fn) + sumlen = [0] + def cb(blocks, length): + assert len(blocks) == len(fs) + oldsumlen = sumlen[0] + sumlen[0] += length + if verbose: + if int((float(oldsumlen) / fsize) * 10) != int((float(sumlen[0]) / fsize) * 10): + print str(int((float(sumlen[0]) / fsize) * 10) * 10) + "% ...", + + if sumlen[0] > fsize: + raise IOError("Wrong file size -- possibly the size of the file changed during encoding. Original size: %d, observed size at least: %s" % (fsize, sumlen[0],)) + for i in range(len(blocks)): + data = blocks[i] + fs[i].write(data) + length -= len(data) + + encode_file_stringy_easyfec(inf, cb, k, m, chunksize=4096) + except EnvironmentError, le: + print "Cannot complete because of exception: " + print le + print "Cleaning up..." + # clean up + while fs: + f = fs.pop() + f.close() ; del f + fn = fns.pop() + if verbose: + print "Cleaning up: trying to remove %r..." % (fn,) + fileutil.remove_if_possible(fn) + return 1 + if verbose: + print + print "Done!" + return 0 + +# Note: if you really prefer base-2 and you change this code, then please +# denote 2^20 as "MiB" instead of "MB" in order to avoid ambiguity. +# Thanks. +# http://en.wikipedia.org/wiki/Megabyte +MILLION_BYTES=10**6 + +def decode_from_files(outf, infiles, verbose=False): + """ + Decode from the first k files in infiles, writing the results to outf. + """ + assert len(infiles) >= 2 + infs = [] + shnums = [] + m = None + k = None + padlen = None + + byteswritten = 0 + for f in infiles: + (nm, nk, npadlen, shnum,) = _parse_header(f) + if not (m is None or m == nm): + raise CorruptedShareFilesError("Share files were corrupted -- share file %r said that m was %s but another share file previously said that m was %s" % (f.name, nm, m,)) + m = nm + if not (k is None or k == nk): + raise CorruptedShareFilesError("Share files were corrupted -- share file %r said that k was %s but another share file previously said that k was %s" % (f.name, nk, k,)) + if k > len(infiles): + raise InsufficientShareFilesError(k, len(infiles)) + k = nk + if not (padlen is None or padlen == npadlen): + raise CorruptedShareFilesError("Share files were corrupted -- share file %r said that pad length was %s but another share file previously said that pad length was %s" % (f.name, npadlen, padlen,)) + padlen = npadlen + + infs.append(f) + shnums.append(shnum) + + if len(infs) == k: + break + + dec = easyfec.Decoder(k, m) + + while True: + chunks = [ inf.read(CHUNKSIZE) for inf in infs ] + if [ch for ch in chunks if len(ch) != len(chunks[-1])]: + raise CorruptedShareFilesError("Share files were corrupted -- all share files are required to be the same length, but they weren't.") + + if len(chunks[-1]) == CHUNKSIZE: + # Then this was a full read, so we're still in the sharefiles. + resultdata = dec.decode(chunks, shnums, padlen=0) + outf.write(resultdata) + byteswritten += len(resultdata) + if verbose: + if ((byteswritten - len(resultdata)) / (10*MILLION_BYTES)) != (byteswritten / (10*MILLION_BYTES)): + print str(byteswritten / MILLION_BYTES) + " MB ...", + else: + # Then this was a short read, so we've reached the end of the sharefiles. + resultdata = dec.decode(chunks, shnums, padlen) + outf.write(resultdata) + return # Done. + if verbose: + print + print "Done!" + +def encode_file(inf, cb, k, m, chunksize=4096): + """ + Read in the contents of inf, encode, and call cb with the results. + + First, k "input blocks" will be read from inf, each input block being of + size chunksize. Then these k blocks will be encoded into m "result + blocks". Then cb will be invoked, passing a list of the m result blocks + as its first argument, and the length of the encoded data as its second + argument. (The length of the encoded data is always equal to k*chunksize, + until the last iteration, when the end of the file has been reached and + less than k*chunksize bytes could be read from the file.) This procedure + is iterated until the end of the file is reached, in which case the space + of the input blocks that is unused is filled with zeroes before encoding. + + Note that the sequence passed in calls to cb() contains mutable array + objects in its first k elements whose contents will be overwritten when + the next segment is read from the input file. Therefore the + implementation of cb() has to either be finished with those first k arrays + before returning, or if it wants to keep the contents of those arrays for + subsequent use after it has returned then it must make a copy of them to + keep. + + @param inf the file object from which to read the data + @param cb the callback to be invoked with the results + @param k the number of shares required to reconstruct the file + @param m the total number of shares created + @param chunksize how much data to read from inf for each of the k input + blocks + """ + enc = fec.Encoder(k, m) + l = tuple([ array.array('c') for i in range(k) ]) + indatasize = k*chunksize # will be reset to shorter upon EOF + eof = False + ZEROES=array.array('c', ['\x00'])*chunksize + while not eof: + # This loop body executes once per segment. + i = 0 + while (i= 3: + return "%s:%s" % (len(x), b32encode(x[-3:]),) + elif len(x) == 2: + return "%s:%s" % (len(x), b32encode(x[-2:]),) + elif len(x) == 1: + return "%s:%s" % (len(x), b32encode(x[-1:]),) + elif len(x) == 0: + return "%s:%s" % (len(x), "--empty--",) + +def _h(k, m, ss): + encer = fec.Encoder(k, m) + nums_and_blocks = list(enumerate(encer.encode(ss))) + assert isinstance(nums_and_blocks, list), nums_and_blocks + assert len(nums_and_blocks) == m, (len(nums_and_blocks), m,) + nums_and_blocks = random.sample(nums_and_blocks, k) + blocks = [ x[1] for x in nums_and_blocks ] + nums = [ x[0] for x in nums_and_blocks ] + decer = fec.Decoder(k, m) + decoded = decer.decode(blocks, nums) + assert len(decoded) == len(ss), (len(decoded), len(ss),) + assert tuple([str(s) for s in decoded]) == tuple([str(s) for s in ss]), (tuple([ab(str(s)) for s in decoded]), tuple([ab(str(s)) for s in ss]),) + +def randstr(n): + return ''.join(map(chr, map(random.randrange, [0]*n, [256]*n))) + +def _help_test_random(): + m = random.randrange(1, 257) + k = random.randrange(1, m+1) + l = random.randrange(0, 2**10) + ss = [ randstr(l/k) for x in range(k) ] + _h(k, m, ss) + +def _help_test_random_with_l(l): + m = 83 + k = 19 + ss = [ randstr(l/k) for x in range(k) ] + _h(k, m, ss) + +class Fec(unittest.TestCase): + def test_random(self): + for i in range(3): + _help_test_random() + if VERBOSE: + print "%d randomized tests pass." % (i+1) + + def test_bad_args_enc(self): + encer = fec.Encoder(2, 4) + try: + encer.encode(["a", "b", ], ["c", "I am not an integer blocknum",]) + except fec.Error, e: + assert "Precondition violation: second argument is required to contain int" in str(e), e + else: + raise "Should have gotten fec.Error for wrong type of second argument." + + try: + encer.encode(["a", "b", ], 98) # not a sequence at all + except TypeError, e: + assert "Second argument (optional) was not a sequence" in str(e), e + else: + raise "Should have gotten TypeError for wrong type of second argument." + + def test_bad_args_dec(self): + decer = fec.Decoder(2, 4) + + try: + decer.decode(98, [0, 1]) # first argument is not a sequence + except TypeError, e: + assert "First argument was not a sequence" in str(e), e + else: + raise "Should have gotten TypeError for wrong type of second argument." + + try: + decer.decode(["a", "b", ], ["c", "d",]) + except fec.Error, e: + assert "Precondition violation: second argument is required to contain int" in str(e), e + else: + raise "Should have gotten fec.Error for wrong type of second argument." + + try: + decer.decode(["a", "b", ], 98) # not a sequence at all + except TypeError, e: + assert "Second argument was not a sequence" in str(e), e + else: + raise "Should have gotten TypeError for wrong type of second argument." + +class FileFec(unittest.TestCase): + def test_filefec_header(self): + for m in [3, 5, 7, 9, 11, 17, 19, 33, 35, 65, 66, 67, 129, 130, 131, 254, 255, 256,]: + for k in [2, 3, 5, 9, 17, 33, 65, 129, 255,]: + if k >= m: + continue + for pad in [0, 1, k-1,]: + if pad >= k: + continue + for sh in [0, 1, m-1,]: + if sh >= m: + continue + h = fec.filefec._build_header(m, k, pad, sh) + hio = cStringIO.StringIO(h) + (rm, rk, rpad, rsh,) = fec.filefec._parse_header(hio) + assert (rm, rk, rpad, rsh,) == (m, k, pad, sh,), h + + def _help_test_filefec(self, teststr, k, m, numshs=None): + if numshs == None: + numshs = m + + TESTFNAME = "testfile.txt" + PREFIX = "test" + SUFFIX = ".fec" + + tempdir = fec.util.fileutil.NamedTemporaryDirectory(cleanup=False) + try: + tempfn = os.path.join(tempdir.name, TESTFNAME) + tempf = open(tempfn, 'wb') + tempf.write(teststr) + tempf.close() + fsize = os.path.getsize(tempfn) + assert fsize == len(teststr) + + # encode the file + fec.filefec.encode_to_files(open(tempfn, 'rb'), fsize, tempdir.name, PREFIX, k, m, SUFFIX, verbose=VERBOSE) + + # select some share files + RE=re.compile(fec.filefec.RE_FORMAT % (PREFIX, SUFFIX,)) + fns = os.listdir(tempdir.name) + sharefs = [ open(os.path.join(tempdir.name, fn), "rb") for fn in fns if RE.match(fn) ] + random.shuffle(sharefs) + del sharefs[numshs:] + + # decode from the share files + outf = open(os.path.join(tempdir.name, 'recovered-testfile.txt'), 'wb') + fec.filefec.decode_from_files(outf, sharefs, verbose=VERBOSE) + outf.close() + + tempfn = open(os.path.join(tempdir.name, 'recovered-testfile.txt'), 'rb') + recovereddata = tempfn.read() + assert recovereddata == teststr + finally: + tempdir.shutdown() + + def test_filefec_all_shares(self): + return self._help_test_filefec("Yellow Whirled!", 3, 8) + + def test_filefec_all_shares_with_padding(self, noisy=VERBOSE): + return self._help_test_filefec("Yellow Whirled!A", 3, 8) + + def test_filefec_min_shares_with_padding(self, noisy=VERBOSE): + return self._help_test_filefec("Yellow Whirled!A", 3, 8, numshs=3) + +if __name__ == "__main__": + if hasattr(unittest, 'main'): + unittest.main() + else: + sys.path.append(os.getcwd()) + mods = [] + fullname = os.path.realpath(os.path.abspath(__file__)) + for pathel in sys.path: + fullnameofpathel = os.path.realpath(os.path.abspath(pathel)) + if fullname.startswith(fullnameofpathel): + relname = fullname[len(fullnameofpathel):] + mod = (os.path.splitext(relname)[0]).replace(os.sep, '.').strip('.') + mods.append(mod) + + mods.sort(cmp=lambda x, y: cmp(len(x), len(y))) + mods.reverse() + for mod in mods: + cmdstr = "trial %s %s" % (' '.join(sys.argv[1:]), mod) + print cmdstr + if os.system(cmdstr) == 0: + break diff --git a/zfec/fec/util/__init__.py b/zfec/fec/util/__init__.py new file mode 100644 index 0000000..e69de29 diff --git a/zfec/fec/util/argparse.py b/zfec/fec/util/argparse.py new file mode 100644 index 0000000..8b273b7 --- /dev/null +++ b/zfec/fec/util/argparse.py @@ -0,0 +1,1799 @@ +# -*- coding: utf-8 -*- + +# Copyright © 2006 Steven J. Bethard . +# +# Redistribution and use in source and binary forms, with or without +# modification, are permitted under the terms of the 3-clause BSD +# license. No warranty expressed or implied. + +"""Command-line parsing library + +This module is an optparse-inspired command-line parsing library that: + +* handles both optional and positional arguments +* produces highly informative usage messages +* supports parsers that dispatch to sub-parsers + +The following is a simple usage example that sums integers from the +command-line and writes the result to a file: + + parser = argparse.ArgumentParser( + description='sum the integers at the command line') + parser.add_argument( + 'integers', metavar='int', nargs='+', type=int, + help='an integer to be summed') + parser.add_argument( + '--log', default=sys.stdout, type=argparse.FileType('w'), + help='the file where the sum should be written') + args = parser.parse_args() + args.log.write('%s' % sum(args.integers)) + args.log.close() + +The module contains the following public classes: + + ArgumentParser -- The main entry point for command-line parsing. As the + example above shows, the add_argument() method is used to populate + the parser with actions for optional and positional arguments. Then + the parse_args() method is invoked to convert the args at the + command-line into an object with attributes. + + ArgumentError -- The exception raised by ArgumentParser objects when + there are errors with the parser's actions. Errors raised while + parsing the command-line are caught by ArgumentParser and emitted + as command-line messages. + + FileType -- A factory for defining types of files to be created. As the + example above shows, instances of FileType are typically passed as + the type= argument of add_argument() calls. + + Action -- The base class for parser actions. Typically actions are + selected by passing strings like 'store_true' or 'append_const' to + the action= argument of add_argument(). However, for greater + customization of ArgumentParser actions, subclasses of Action may + be defined and passed as the action= argument. + + HelpFormatter, RawDescriptionHelpFormatter -- Formatter classes which + may be passed as the formatter_class= argument to the + ArgumentParser constructor. HelpFormatter is the default, while + RawDescriptionHelpFormatter tells the parser not to perform any + line-wrapping on description text. + +All other classes in this module are considered implementation details. +(Also note that HelpFormatter and RawDescriptionHelpFormatter are only +considered public as object names -- the API of the formatter objects is +still considered an implementation detail.) +""" + +import os as _os +import re as _re +import sys as _sys +import textwrap as _textwrap + +from gettext import gettext as _ + +SUPPRESS = '==SUPPRESS==' + +OPTIONAL = '?' +ZERO_OR_MORE = '*' +ONE_OR_MORE = '+' +PARSER = '==PARSER==' + +# ============================= +# Utility functions and classes +# ============================= + +class _AttributeHolder(object): + """Abstract base class that provides __repr__. + + The __repr__ method returns a string in the format: + ClassName(attr=name, attr=name, ...) + The attributes are determined either by a class-level attribute, + '_kwarg_names', or by inspecting the instance __dict__. + """ + + def __repr__(self): + type_name = type(self).__name__ + arg_strings = [] + for arg in self._get_args(): + arg_strings.append(repr(arg)) + for name, value in self._get_kwargs(): + arg_strings.append('%s=%r' % (name, value)) + return '%s(%s)' % (type_name, ', '.join(arg_strings)) + + def _get_kwargs(self): + return sorted(self.__dict__.items()) + + def _get_args(self): + return [] + +def _ensure_value(namespace, name, value): + if getattr(namespace, name, None) is None: + setattr(namespace, name, value) + return getattr(namespace, name) + + + +# =============== +# Formatting Help +# =============== + +class HelpFormatter(object): + + def __init__(self, + prog, + indent_increment=2, + max_help_position=24, + width=None): + + # default setting for width + if width is None: + try: + width = int(_os.environ['COLUMNS']) + except (KeyError, ValueError): + width = 80 + width -= 2 + + self._prog = prog + self._indent_increment = indent_increment + self._max_help_position = max_help_position + self._width = width + + self._current_indent = 0 + self._level = 0 + self._action_max_length = 0 + + self._root_section = self._Section(self, None) + self._current_section = self._root_section + + self._whitespace_matcher = _re.compile(r'\s+') + self._long_break_matcher = _re.compile(r'\n\n\n+') + + # =============================== + # Section and indentation methods + # =============================== + + def _indent(self): + self._current_indent += self._indent_increment + self._level += 1 + + def _dedent(self): + self._current_indent -= self._indent_increment + assert self._current_indent >= 0, 'Indent decreased below 0.' + self._level -= 1 + + class _Section(object): + def __init__(self, formatter, parent, heading=None): + self.formatter = formatter + self.parent = parent + self.heading = heading + self.items = [] + + def format_help(self): + # format the indented section + if self.parent is not None: + self.formatter._indent() + join = self.formatter._join_parts + for func, args in self.items: + func(*args) + item_help = join(func(*args) for func, args in self.items) + if self.parent is not None: + self.formatter._dedent() + + # return nothing if the section was empty + if not item_help: + return '' + + # add the heading if the section was non-empty + if self.heading is not SUPPRESS and self.heading is not None: + current_indent = self.formatter._current_indent + heading = '%*s%s:\n' % (current_indent, '', self.heading) + else: + heading = '' + + # join the section-initial newline, the heading and the help + return join(['\n', heading, item_help, '\n']) + + def _add_item(self, func, args): + self._current_section.items.append((func, args)) + + # ======================== + # Message building methods + # ======================== + + def start_section(self, heading): + self._indent() + section = self._Section(self, self._current_section, heading) + self._add_item(section.format_help, []) + self._current_section = section + + def end_section(self): + self._current_section = self._current_section.parent + self._dedent() + + def add_text(self, text): + if text is not SUPPRESS and text is not None: + self._add_item(self._format_text, [text]) + + def add_usage(self, usage, optionals, positionals, prefix=None): + if usage is not SUPPRESS: + args = usage, optionals, positionals, prefix + self._add_item(self._format_usage, args) + + def add_argument(self, action): + if action.help is not SUPPRESS: + + # update the maximum item length + invocation = self._format_action_invocation(action) + action_length = len(invocation) + self._current_indent + self._action_max_length = max(self._action_max_length, + action_length) + + # add the item to the list + self._add_item(self._format_action, [action]) + + def add_arguments(self, actions): + for action in actions: + self.add_argument(action) + + # ======================= + # Help-formatting methods + # ======================= + + def format_help(self): + help = self._root_section.format_help() % dict(prog=self._prog) + if help: + help = self._long_break_matcher.sub('\n\n', help) + help = help.strip('\n') + '\n' + return help + + def _join_parts(self, part_strings): + return ''.join(part + for part in part_strings + if part and part is not SUPPRESS) + + def _format_usage(self, usage, optionals, positionals, prefix): + if prefix is None: + prefix = _('usage: ') + + # if no optionals or positionals are available, usage is just prog + if usage is None and not optionals and not positionals: + usage = '%(prog)s' + + # if optionals and positionals are available, calculate usage + elif usage is None: + usage = '%(prog)s' % dict(prog=self._prog) + + # determine width of "usage: PROG" and width of text + prefix_width = len(prefix) + len(usage) + 1 + prefix_indent = self._current_indent + prefix_width + text_width = self._width - self._current_indent + + # put them on one line if they're short enough + format = self._format_actions_usage + action_usage = format(optionals + positionals) + if prefix_width + len(action_usage) + 1 < text_width: + usage = '%s %s' % (usage, action_usage) + + # if they're long, wrap optionals and positionals individually + else: + optional_usage = format(optionals) + positional_usage = format(positionals) + indent = ' ' * prefix_indent + + # usage is made of PROG, optionals and positionals + parts = [usage, ' '] + + # options always get added right after PROG + if optional_usage: + parts.append(_textwrap.fill( + optional_usage, text_width, + initial_indent=indent, + subsequent_indent=indent).lstrip()) + + # if there were options, put arguments on the next line + # otherwise, start them right after PROG + if positional_usage: + part = _textwrap.fill( + positional_usage, text_width, + initial_indent=indent, + subsequent_indent=indent).lstrip() + if optional_usage: + part = '\n' + indent + part + parts.append(part) + usage = ''.join(parts) + + # prefix with 'usage:' + return '%s%s\n\n' % (prefix, usage) + + def _format_actions_usage(self, actions): + parts = [] + for action in actions: + if action.help is SUPPRESS: + continue + + # produce all arg strings + if not action.option_strings: + parts.append(self._format_args(action, action.dest)) + + # produce the first way to invoke the option in brackets + else: + option_string = action.option_strings[0] + + # if the Optional doesn't take a value, format is: + # -s or --long + if action.nargs == 0: + part = '%s' % option_string + + # if the Optional takes a value, format is: + # -s ARGS or --long ARGS + else: + default = action.dest.upper() + args_string = self._format_args(action, default) + part = '%s %s' % (option_string, args_string) + + # make it look optional if it's not required + if not action.required: + part = '[%s]' % part + parts.append(part) + + return ' '.join(parts) + + def _format_text(self, text): + text_width = self._width - self._current_indent + indent = ' ' * self._current_indent + return self._fill_text(text, text_width, indent) + '\n\n' + + def _format_action(self, action): + # determine the required width and the entry label + help_position = min(self._action_max_length + 2, + self._max_help_position) + help_width = self._width - help_position + action_width = help_position - self._current_indent - 2 + action_header = self._format_action_invocation(action) + + # ho nelp; start on same line and add a final newline + if not action.help: + tup = self._current_indent, '', action_header + action_header = '%*s%s\n' % tup + + # short action name; start on the same line and pad two spaces + elif len(action_header) <= action_width: + tup = self._current_indent, '', action_width, action_header + action_header = '%*s%-*s ' % tup + indent_first = 0 + + # long action name; start on the next line + else: + tup = self._current_indent, '', action_header + action_header = '%*s%s\n' % tup + indent_first = help_position + + # collect the pieces of the action help + parts = [action_header] + + # if there was help for the action, add lines of help text + if action.help: + help_text = self._expand_help(action) + help_lines = self._split_lines(help_text, help_width) + parts.append('%*s%s\n' % (indent_first, '', help_lines[0])) + for line in help_lines[1:]: + parts.append('%*s%s\n' % (help_position, '', line)) + + # or add a newline if the description doesn't end with one + elif not action_header.endswith('\n'): + parts.append('\n') + + # return a single string + return self._join_parts(parts) + + def _format_action_invocation(self, action): + if not action.option_strings: + return self._format_metavar(action, action.dest) + + else: + parts = [] + + # if the Optional doesn't take a value, format is: + # -s, --long + if action.nargs == 0: + parts.extend(action.option_strings) + + # if the Optional takes a value, format is: + # -s ARGS, --long ARGS + else: + default = action.dest.upper() + args_string = self._format_args(action, default) + for option_string in action.option_strings: + parts.append('%s %s' % (option_string, args_string)) + + return ', '.join(parts) + + def _format_metavar(self, action, default_metavar): + if action.metavar is not None: + name = action.metavar + elif action.choices is not None: + choice_strs = (str(choice) for choice in action.choices) + name = '{%s}' % ','.join(choice_strs) + else: + name = default_metavar + return name + + def _format_args(self, action, default_metavar): + name = self._format_metavar(action, default_metavar) + if action.nargs is None: + result = name + elif action.nargs == OPTIONAL: + result = '[%s]' % name + elif action.nargs == ZERO_OR_MORE: + result = '[%s [%s ...]]' % (name, name) + elif action.nargs == ONE_OR_MORE: + result = '%s [%s ...]' % (name, name) + elif action.nargs is PARSER: + result = '%s ...' % name + else: + result = ' '.join([name] * action.nargs) + return result + + def _expand_help(self, action): + params = dict(vars(action), prog=self._prog) + for name, value in params.items(): + if value is SUPPRESS: + del params[name] + if params.get('choices') is not None: + choices_str = ', '.join(str(c) for c in params['choices']) + params['choices'] = choices_str + return action.help % params + + def _split_lines(self, text, width): + text = self._whitespace_matcher.sub(' ', text).strip() + return _textwrap.wrap(text, width) + + def _fill_text(self, text, width, indent): + text = self._whitespace_matcher.sub(' ', text).strip() + return _textwrap.fill(text, width, initial_indent=indent, + subsequent_indent=indent) + +class RawDescriptionHelpFormatter(HelpFormatter): + + def _fill_text(self, text, width, indent): + return ''.join(indent + line for line in text.splitlines(True)) + +class RawTextHelpFormatter(RawDescriptionHelpFormatter): + + def _split_lines(self, text, width): + return text.splitlines() + +# ===================== +# Options and Arguments +# ===================== + +class ArgumentError(Exception): + """ArgumentError(message, argument) + + Raised whenever there was an error creating or using an argument + (optional or positional). + + The string value of this exception is the message, augmented with + information about the argument that caused it. + """ + + def __init__(self, argument, message): + if argument.option_strings: + self.argument_name = '/'.join(argument.option_strings) + elif argument.metavar not in (None, SUPPRESS): + self.argument_name = argument.metavar + elif argument.dest not in (None, SUPPRESS): + self.argument_name = argument.dest + else: + self.argument_name = None + self.message = message + + def __str__(self): + if self.argument_name is None: + format = '%(message)s' + else: + format = 'argument %(argument_name)s: %(message)s' + return format % dict(message=self.message, + argument_name=self.argument_name) + +# ============== +# Action classes +# ============== + +class Action(_AttributeHolder): + """Action(*strings, **options) + + Action objects hold the information necessary to convert a + set of command-line arguments (possibly including an initial option + string) into the desired Python object(s). + + Keyword Arguments: + + option_strings -- A list of command-line option strings which + should be associated with this action. + + dest -- The name of the attribute to hold the created object(s) + + nargs -- The number of command-line arguments that should be consumed. + By default, one argument will be consumed and a single value will + be produced. Other values include: + * N (an integer) consumes N arguments (and produces a list) + * '?' consumes zero or one arguments + * '*' consumes zero or more arguments (and produces a list) + * '+' consumes one or more arguments (and produces a list) + Note that the difference between the default and nargs=1 is that + with the default, a single value will be produced, while with + nargs=1, a list containing a single value will be produced. + + const -- The value to be produced if the option is specified and the + option uses an action that takes no values. + + default -- The value to be produced if the option is not specified. + + type -- The type which the command-line arguments should be converted + to, should be one of 'string', 'int', 'float', 'complex' or a + callable object that accepts a single string argument. If None, + 'string' is assumed. + + choices -- A container of values that should be allowed. If not None, + after a command-line argument has been converted to the appropriate + type, an exception will be raised if it is not a member of this + collection. + + required -- True if the action must always be specified at the command + line. This is only meaningful for optional command-line arguments. + + help -- The help string describing the argument. + + metavar -- The name to be used for the option's argument with the help + string. If None, the 'dest' value will be used as the name. + """ + + + def __init__(self, + option_strings, + dest, + nargs=None, + const=None, + default=None, + type=None, + choices=None, + required=False, + help=None, + metavar=None): + self.option_strings = option_strings + self.dest = dest + self.nargs = nargs + self.const = const + self.default = default + self.type = type + self.choices = choices + self.required = required + self.help = help + self.metavar = metavar + + def _get_kwargs(self): + names = [ + 'option_strings', + 'dest', + 'nargs', + 'const', + 'default', + 'type', + 'choices', + 'help', + 'metavar' + ] + return [(name, getattr(self, name)) for name in names] + + def __call__(self, parser, namespace, values, option_string=None): + raise NotImplementedError(_('.__call__() not defined')) + +class _StoreAction(Action): + def __init__(self, + option_strings, + dest, + nargs=None, + const=None, + default=None, + type=None, + choices=None, + required=False, + help=None, + metavar=None): + if nargs == 0: + raise ValueError('nargs must be > 0') + if const is not None and nargs != OPTIONAL: + raise ValueError('nargs must be %r to supply const' % OPTIONAL) + super(_StoreAction, self).__init__( + option_strings=option_strings, + dest=dest, + nargs=nargs, + const=const, + default=default, + type=type, + choices=choices, + required=required, + help=help, + metavar=metavar) + + def __call__(self, parser, namespace, values, option_string=None): + setattr(namespace, self.dest, values) + +class _StoreConstAction(Action): + def __init__(self, + option_strings, + dest, + const, + default=None, + required=False, + help=None, + metavar=None): + super(_StoreConstAction, self).__init__( + option_strings=option_strings, + dest=dest, + nargs=0, + const=const, + default=default, + required=required, + help=help) + + def __call__(self, parser, namespace, values, option_string=None): + setattr(namespace, self.dest, self.const) + +class _StoreTrueAction(_StoreConstAction): + def __init__(self, + option_strings, + dest, + default=False, + required=False, + help=None): + super(_StoreTrueAction, self).__init__( + option_strings=option_strings, + dest=dest, + const=True, + default=default, + required=required, + help=help) + +class _StoreFalseAction(_StoreConstAction): + def __init__(self, + option_strings, + dest, + default=True, + required=False, + help=None): + super(_StoreFalseAction, self).__init__( + option_strings=option_strings, + dest=dest, + const=False, + default=default, + required=required, + help=help) + +class _AppendAction(Action): + def __init__(self, + option_strings, + dest, + nargs=None, + const=None, + default=None, + type=None, + choices=None, + required=False, + help=None, + metavar=None): + if nargs == 0: + raise ValueError('nargs must be > 0') + if const is not None and nargs != OPTIONAL: + raise ValueError('nargs must be %r to supply const' % OPTIONAL) + super(_AppendAction, self).__init__( + option_strings=option_strings, + dest=dest, + nargs=nargs, + const=const, + default=default, + type=type, + choices=choices, + required=required, + help=help, + metavar=metavar) + + def __call__(self, parser, namespace, values, option_string=None): + _ensure_value(namespace, self.dest, []).append(values) + +class _AppendConstAction(Action): + def __init__(self, + option_strings, + dest, + const, + default=None, + required=False, + help=None, + metavar=None): + super(_AppendConstAction, self).__init__( + option_strings=option_strings, + dest=dest, + nargs=0, + const=const, + default=default, + required=required, + help=help, + metavar=metavar) + + def __call__(self, parser, namespace, values, option_string=None): + _ensure_value(namespace, self.dest, []).append(self.const) + +class _CountAction(Action): + def __init__(self, + option_strings, + dest, + default=None, + required=False, + help=None): + super(_CountAction, self).__init__( + option_strings=option_strings, + dest=dest, + nargs=0, + default=default, + required=required, + help=help) + + def __call__(self, parser, namespace, values, option_string=None): + new_count = _ensure_value(namespace, self.dest, 0) + 1 + setattr(namespace, self.dest, new_count) + +class _HelpAction(Action): + def __init__(self, + option_strings, + dest, + help=None): + super(_HelpAction, self).__init__( + option_strings=option_strings, + dest=SUPPRESS, + nargs=0, + help=help) + + def __call__(self, parser, namespace, values, option_string=None): + parser.print_help() + parser.exit() + +class _VersionAction(Action): + def __init__(self, + option_strings, + dest, + help=None): + super(_VersionAction, self).__init__( + option_strings=option_strings, + dest=SUPPRESS, + nargs=0, + help=help) + + def __call__(self, parser, namespace, values, option_string=None): + parser.print_version() + parser.exit() + +class _SubParsersAction(Action): + + def __init__(self, + option_strings, + prog, + parser_class, + dest=SUPPRESS, + help=None, + metavar=None): + + self._prog_prefix = prog + self._parser_class = parser_class + self._name_parser_map = {} + + super(_SubParsersAction, self).__init__( + option_strings=option_strings, + dest=dest, + nargs=PARSER, + choices=self._name_parser_map, + help=help, + metavar=metavar) + + def add_parser(self, name, **kwargs): + if kwargs.get('prog') is None: + kwargs['prog'] = '%s %s' % (self._prog_prefix, name) + + parser = self._parser_class(**kwargs) + self._name_parser_map[name] = parser + return parser + + def __call__(self, parser, namespace, values, option_string=None): + parser_name = values[0] + arg_strings = values[1:] + + # set the parser name if requested + if self.dest is not SUPPRESS: + setattr(namespace, self.dest, parser_name) + + # select the parser + try: + parser = self._name_parser_map[parser_name] + except KeyError: + tup = parser_name, ', '.join(self._name_parser_map) + msg = _('unknown parser %r (choices: %s)' % tup) + raise ArgumentError(self, msg) + + # parse all the remaining options into the namespace + parser.parse_args(arg_strings, namespace) + + +# ============== +# Type classes +# ============== + +class FileType(object): + """Factory for creating file object types + + Instances of FileType are typically passed as type= arguments to the + ArgumentParser add_argument() method. + + Keyword Arguments: + mode -- A string indicating how the file is to be opened. Accepts the + same values as the builtin open() function. + bufsize -- The file's desired buffer size. Accepts the same values as + the builtin open() function. + exclusiveopen -- A bool indicating whether the attempt to create the file + should fail if there is already a file present by that name. This is + ignored if 'w' is not in mode. + """ + def __init__(self, mode='r', bufsize=None, exclusivecreate=False): + self._mode = mode + self._bufsize = bufsize + if self._bufsize is None: + self._bufsize = -1 + self._exclusivecreate = exclusivecreate + + def __call__(self, string): + # the special argument "-" means sys.std{in,out} + if string == '-': + if self._mode == 'r': + return _sys.stdin + elif self._mode == 'w': + return _sys.stdout + else: + msg = _('argument "-" with mode %r' % self._mode) + raise ValueError(msg) + + # all other arguments are used as file names + if self._exclusivecreate and ('w' in self._mode): + fd = _os.open(string, _os.O_CREAT|_os.O_EXCL) + return _os.fdopen(fd, self._mode, self._bufsize) + else: + return open(string, self._mode, self._bufsize) + + +# =========================== +# Optional and Positional Parsing +# =========================== + +class Namespace(_AttributeHolder): + + def __init__(self, **kwargs): + for name, value in kwargs.iteritems(): + setattr(self, name, value) + + def __eq__(self, other): + return vars(self) == vars(other) + + def __ne__(self, other): + return not (self == other) + + +class _ActionsContainer(object): + def __init__(self, + description, + conflict_handler): + superinit = super(_ActionsContainer, self).__init__ + superinit(description=description) + + self.description = description + self.conflict_handler = conflict_handler + + # set up registries + self._registries = {} + + # register actions + self.register('action', None, _StoreAction) + self.register('action', 'store', _StoreAction) + self.register('action', 'store_const', _StoreConstAction) + self.register('action', 'store_true', _StoreTrueAction) + self.register('action', 'store_false', _StoreFalseAction) + self.register('action', 'append', _AppendAction) + self.register('action', 'append_const', _AppendConstAction) + self.register('action', 'count', _CountAction) + self.register('action', 'help', _HelpAction) + self.register('action', 'version', _VersionAction) + self.register('action', 'parsers', _SubParsersAction) + + # raise an exception if the conflict handler is invalid + self._get_handler() + + # action storage + self._optional_actions_list = [] + self._positional_actions_list = [] + self._positional_actions_full_list = [] + self._option_strings = {} + + # ==================== + # Registration methods + # ==================== + + def register(self, registry_name, value, object): + registry = self._registries.setdefault(registry_name, {}) + registry[value] = object + + def _registry_get(self, registry_name, value, default=None): + return self._registries[registry_name].get(value, default) + + # ======================= + # Adding argument actions + # ======================= + + def add_argument(self, *args, **kwargs): + """ + add_argument(dest, ..., name=value, ...) + add_argument(option_string, option_string, ..., name=value, ...) + """ + + # type='outfile' is deprecated + if kwargs.get('type') == 'outfile': + import warnings + msg = _("use type=FileType('w') instead of type='outfile'") + warnings.warn(msg, DeprecationWarning) + + # if no positional args are supplied or only one is supplied and + # it doesn't look like an option string, parse a positional + # argument + if not args or len(args) == 1 and args[0][0] != '-': + kwargs = self._get_positional_kwargs(*args, **kwargs) + + # otherwise, we're adding an optional argument + else: + kwargs = self._get_optional_kwargs(*args, **kwargs) + + # create the action object, and add it to the parser + action_class = self._pop_action_class(kwargs) + action = action_class(**kwargs) + return self._add_action(action) + + def _add_action(self, action): + # resolve any conflicts + self._check_conflict(action) + + # add to optional or positional list + if action.option_strings: + self._optional_actions_list.append(action) + else: + self._positional_actions_list.append(action) + self._positional_actions_full_list.append(action) + action.container = self + + # index the action by any option strings it has + for option_string in action.option_strings: + self._option_strings[option_string] = action + + # return the created action + return action + + def _add_container_actions(self, container): + for action in container._optional_actions_list: + self._add_action(action) + for action in container._positional_actions_list: + self._add_action(action) + + def _get_positional_kwargs(self, dest, **kwargs): + # make sure required is not specified + if 'required' in kwargs: + msg = _("'required' is an invalid argument for positionals") + raise TypeError(msg) + + # return the keyword arguments with no option strings + return dict(kwargs, dest=dest, option_strings=[]) + + def _get_optional_kwargs(self, *args, **kwargs): + # determine short and long option strings + option_strings = [] + long_option_strings = [] + for option_string in args: + # error on one-or-fewer-character option strings + if len(option_string) < 2: + msg = _('invalid option string %r: ' + 'must be at least two characters long') + raise ValueError(msg % option_string) + + # error on strings that don't start with '-' + if not option_string.startswith('-'): + msg = _('invalid option string %r: ' + 'does not start with "-"') + raise ValueError(msg % option_string) + + # error on strings that are all '-'s + if not option_string.replace('-', ''): + msg = _('invalid option string %r: ' + 'must contain characters other than "-"') + raise ValueError(msg % option_string) + + # strings starting with '--' are long options + option_strings.append(option_string) + if option_string.startswith('--'): + long_option_strings.append(option_string) + + # infer destination, '--foo-bar' -> 'foo_bar' and '-x' -> 'x' + dest = kwargs.pop('dest', None) + if dest is None: + if long_option_strings: + dest_option_string = long_option_strings[0] + else: + dest_option_string = option_strings[0] + dest = dest_option_string.lstrip('-').replace('-', '_') + + # return the updated keyword arguments + return dict(kwargs, dest=dest, option_strings=option_strings) + + def _pop_action_class(self, kwargs, default=None): + action = kwargs.pop('action', default) + return self._registry_get('action', action, action) + + def _get_handler(self): + # determine function from conflict handler string + handler_func_name = '_handle_conflict_%s' % self.conflict_handler + try: + return getattr(self, handler_func_name) + except AttributeError: + msg = _('invalid conflict_resolution value: %r') + raise ValueError(msg % self.conflict_handler) + + def _check_conflict(self, action): + + # find all options that conflict with this option + confl_optionals = [] + for option_string in action.option_strings: + if option_string in self._option_strings: + confl_optional = self._option_strings[option_string] + confl_optionals.append((option_string, confl_optional)) + + # resolve any conflicts + if confl_optionals: + conflict_handler = self._get_handler() + conflict_handler(action, confl_optionals) + + def _handle_conflict_error(self, action, conflicting_actions): + message = _('conflicting option string(s): %s') + conflict_string = ', '.join(option_string + for option_string, action + in conflicting_actions) + raise ArgumentError(action, message % conflict_string) + + def _handle_conflict_resolve(self, action, conflicting_actions): + + # remove all conflicting options + for option_string, action in conflicting_actions: + + # remove the conflicting option + action.option_strings.remove(option_string) + self._option_strings.pop(option_string, None) + + # if the option now has no option string, remove it from the + # container holding it + if not action.option_strings: + action.container._optional_actions_list.remove(action) + + +class _ArgumentGroup(_ActionsContainer): + + def __init__(self, container, title=None, description=None, **kwargs): + # add any missing keyword arguments by checking the container + update = kwargs.setdefault + update('conflict_handler', container.conflict_handler) + superinit = super(_ArgumentGroup, self).__init__ + superinit(description=description, **kwargs) + + self.title = title + self._registries = container._registries + self._positional_actions_full_list = container._positional_actions_full_list + self._option_strings = container._option_strings + + +class ArgumentParser(_AttributeHolder, _ActionsContainer): + + def __init__(self, + prog=None, + usage=None, + description=None, + epilog=None, + version=None, + parents=[], + formatter_class=HelpFormatter, + conflict_handler='error', + add_help=True): + + superinit = super(ArgumentParser, self).__init__ + superinit(description=description, + conflict_handler=conflict_handler) + + # default setting for prog + if prog is None: + prog = _os.path.basename(_sys.argv[0]) + + self.prog = prog + self.usage = usage + self.epilog = epilog + self.version = version + self.formatter_class = formatter_class + self.add_help = add_help + + self._argument_group_class = _ArgumentGroup + self._has_subparsers = False + self._argument_groups = [] + self._defaults = {} + + # register types + def identity(string): + return string + def outfile(string): + if string == '-': + return _sys.stdout + else: + return open(string, 'w') + self.register('type', None, identity) + self.register('type', 'outfile', outfile) + + # add help and version arguments if necessary + if self.add_help: + self._add_help_argument() + if self.version: + self._add_version_argument() + + # add parent arguments and defaults + for parent in parents: + self._add_container_actions(parent) + try: + defaults = parent._defaults + except AttributeError: + pass + else: + self._defaults.update(defaults) + + # determines whether an "option" looks like a negative number + self._negative_number_matcher = _re.compile(r'^-\d+|-\d*.\d+$') + + + # ======================= + # Pretty __repr__ methods + # ======================= + + def _get_kwargs(self): + names = [ + 'prog', + 'usage', + 'description', + 'version', + 'formatter_class', + 'conflict_handler', + 'add_help', + ] + return [(name, getattr(self, name)) for name in names] + + # ================================== + # Namespace default settings methods + # ================================== + + def set_defaults(self, **kwargs): + self._defaults.update(kwargs) + + # ================================== + # Optional/Positional adding methods + # ================================== + + def add_argument_group(self, *args, **kwargs): + group = self._argument_group_class(self, *args, **kwargs) + self._argument_groups.append(group) + return group + + def add_subparsers(self, **kwargs): + if self._has_subparsers: + self.error(_('cannot have multiple subparser arguments')) + + # add the parser class to the arguments if it's not present + kwargs.setdefault('parser_class', type(self)) + + # prog defaults to the usage message of this parser, skipping + # optional arguments and with no "usage:" prefix + if kwargs.get('prog') is None: + formatter = self._get_formatter() + formatter.add_usage(self.usage, [], + self._get_positional_actions(), '') + kwargs['prog'] = formatter.format_help().strip() + + # create the parsers action and add it to the positionals list + parsers_class = self._pop_action_class(kwargs, 'parsers') + action = parsers_class(option_strings=[], **kwargs) + self._positional_actions_list.append(action) + self._positional_actions_full_list.append(action) + self._has_subparsers = True + + # return the created parsers action + return action + + def _add_container_actions(self, container): + super(ArgumentParser, self)._add_container_actions(container) + try: + groups = container._argument_groups + except AttributeError: + pass + else: + for group in groups: + new_group = self.add_argument_group( + title=group.title, + description=group.description, + conflict_handler=group.conflict_handler) + new_group._add_container_actions(group) + + def _get_optional_actions(self): + actions = [] + actions.extend(self._optional_actions_list) + for argument_group in self._argument_groups: + actions.extend(argument_group._optional_actions_list) + return actions + + def _get_positional_actions(self): + return list(self._positional_actions_full_list) + + def _add_help_argument(self): + self.add_argument('-h', '--help', action='help', + help=_('show this help message and exit')) + + def _add_version_argument(self): + self.add_argument('-v', '--version', action='version', + help=_("show program's version number and exit")) + + + # ===================================== + # Command line argument parsing methods + # ===================================== + + def parse_args(self, args=None, namespace=None): + # args default to the system args + if args is None: + args = _sys.argv[1:] + + # default Namespace built from parser defaults + if namespace is None: + namespace = Namespace() + + # add any action defaults that aren't present + optional_actions = self._get_optional_actions() + positional_actions = self._get_positional_actions() + for action in optional_actions + positional_actions: + if action.dest is not SUPPRESS: + if not hasattr(namespace, action.dest): + if action.default is not SUPPRESS: + default = action.default + if isinstance(action.default, basestring): + default = self._get_value(action, default) + setattr(namespace, action.dest, default) + + # add any parser defaults that aren't present + for dest, value in self._defaults.iteritems(): + if not hasattr(namespace, dest): + setattr(namespace, dest, value) + + # parse the arguments and exit if there are any errors + try: + result = self._parse_args(args, namespace) + except ArgumentError, err: + self.error(str(err)) + + # make sure all required optionals are present + for action in self._get_optional_actions(): + if action.required: + if getattr(result, action.dest, None) is None: + opt_strs = '/'.join(action.option_strings) + msg = _('option %s is required' % opt_strs) + self.error(msg) + + # return the parsed arguments + return result + + def _parse_args(self, arg_strings, namespace): + + # find all option indices, and determine the arg_string_pattern + # which has an 'O' if there is an option at an index, + # an 'A' if there is an argument, or a '-' if there is a '--' + option_string_indices = {} + arg_string_pattern_parts = [] + arg_strings_iter = iter(arg_strings) + for i, arg_string in enumerate(arg_strings_iter): + + # all args after -- are non-options + if arg_string == '--': + arg_string_pattern_parts.append('-') + for arg_string in arg_strings_iter: + arg_string_pattern_parts.append('A') + + # otherwise, add the arg to the arg strings + # and note the index if it was an option + else: + option_tuple = self._parse_optional(arg_string) + if option_tuple is None: + pattern = 'A' + else: + option_string_indices[i] = option_tuple + pattern = 'O' + arg_string_pattern_parts.append(pattern) + + # join the pieces together to form the pattern + arg_strings_pattern = ''.join(arg_string_pattern_parts) + + # converts arg strings to the appropriate and then takes the action + def take_action(action, argument_strings, option_string=None): + argument_values = self._get_values(action, argument_strings) + action(self, namespace, argument_values, option_string) + + # function to convert arg_strings into an optional action + def consume_optional(start_index): + + # determine the optional action and parse any explicit + # argument out of the option string + option_tuple = option_string_indices[start_index] + action, option_string, explicit_arg = option_tuple + + # loop because single-dash options can be chained + # (e.g. -xyz is the same as -x -y -z if no args are required) + match_argument = self._match_argument + action_tuples = [] + while True: + + # if we found no optional action, raise an error + if action is None: + self.error(_('no such option: %s') % option_string) + + # if there is an explicit argument, try to match the + # optional's string arguments to only this + if explicit_arg is not None: + arg_count = match_argument(action, 'A') + + # if the action is a single-dash option and takes no + # arguments, try to parse more single-dash options out + # of the tail of the option string + if arg_count == 0 and option_string[1] != '-': + action_tuples.append((action, [], option_string)) + option_string = '-' + explicit_arg + option_tuple = self._parse_optional(option_string) + if option_tuple[0] is None: + msg = _('ignored explicit argument %r') + raise ArgumentError(action, msg % explicit_arg) + + # set the action, etc. for the next loop iteration + action, option_string, explicit_arg = option_tuple + + # if the action expect exactly one argument, we've + # successfully matched the option; exit the loop + elif arg_count == 1: + stop = start_index + 1 + args = [explicit_arg] + action_tuples.append((action, args, option_string)) + break + + # error if a double-dash option did not use the + # explicit argument + else: + msg = _('ignored explicit argument %r') + raise ArgumentError(action, msg % explicit_arg) + + # if there is no explicit argument, try to match the + # optional's string arguments with the following strings + # if successful, exit the loop + else: + start = start_index + 1 + selected_patterns = arg_strings_pattern[start:] + arg_count = match_argument(action, selected_patterns) + stop = start + arg_count + args = arg_strings[start:stop] + action_tuples.append((action, args, option_string)) + break + + # add the Optional to the list and return the index at which + # the Optional's string args stopped + assert action_tuples + for action, args, option_string in action_tuples: + take_action(action, args, option_string) + return stop + + # the list of Positionals left to be parsed; this is modified + # by consume_positionals() + positionals = self._get_positional_actions() + + # function to convert arg_strings into positional actions + def consume_positionals(start_index): + # match as many Positionals as possible + match_partial = self._match_arguments_partial + selected_pattern = arg_strings_pattern[start_index:] + arg_counts = match_partial(positionals, selected_pattern) + + # slice off the appropriate arg strings for each Positional + # and add the Positional and its args to the list + for action, arg_count in zip(positionals, arg_counts): + args = arg_strings[start_index: start_index + arg_count] + start_index += arg_count + take_action(action, args) + + # slice off the Positionals that we just parsed and return the + # index at which the Positionals' string args stopped + positionals[:] = positionals[len(arg_counts):] + return start_index + + # consume Positionals and Optionals alternately, until we have + # passed the last option string + start_index = 0 + if option_string_indices: + max_option_string_index = max(option_string_indices) + else: + max_option_string_index = -1 + while start_index <= max_option_string_index: + + # consume any Positionals preceding the next option + next_option_string_index = min( + index + for index in option_string_indices + if index >= start_index) + if start_index != next_option_string_index: + positionals_end_index = consume_positionals(start_index) + + # only try to parse the next optional if we didn't consume + # the option string during the positionals parsing + if positionals_end_index > start_index: + start_index = positionals_end_index + continue + else: + start_index = positionals_end_index + + # if we consumed all the positionals we could and we're not + # at the index of an option string, there were unparseable + # arguments + if start_index not in option_string_indices: + msg = _('extra arguments found: %s') + extras = arg_strings[start_index:next_option_string_index] + self.error(msg % ' '.join(extras)) + + # consume the next optional and any arguments for it + start_index = consume_optional(start_index) + + # consume any positionals following the last Optional + stop_index = consume_positionals(start_index) + + # if we didn't consume all the argument strings, there were too + # many supplied + if stop_index != len(arg_strings): + extras = arg_strings[stop_index:] + self.error(_('extra arguments found: %s') % ' '.join(extras)) + + # if we didn't use all the Positional objects, there were too few + # arg strings supplied. + if positionals: + self.error(_('too few arguments')) + + # return the updated namespace + return namespace + + def _match_argument(self, action, arg_strings_pattern): + # match the pattern for this action to the arg strings + nargs_pattern = self._get_nargs_pattern(action) + match = _re.match(nargs_pattern, arg_strings_pattern) + + # raise an exception if we weren't able to find a match + if match is None: + nargs_errors = { + None:_('expected one argument'), + OPTIONAL:_('expected at most one argument'), + ONE_OR_MORE:_('expected at least one argument') + } + default = _('expected %s argument(s)') % action.nargs + msg = nargs_errors.get(action.nargs, default) + raise ArgumentError(action, msg) + + # return the number of arguments matched + return len(match.group(1)) + + def _match_arguments_partial(self, actions, arg_strings_pattern): + # progressively shorten the actions list by slicing off the + # final actions until we find a match + result = [] + for i in xrange(len(actions), 0, -1): + actions_slice = actions[:i] + pattern = ''.join(self._get_nargs_pattern(action) + for action in actions_slice) + match = _re.match(pattern, arg_strings_pattern) + if match is not None: + result.extend(len(string) for string in match.groups()) + break + + # return the list of arg string counts + return result + + def _parse_optional(self, arg_string): + # if it doesn't start with a '-', it was meant to be positional + if not arg_string.startswith('-'): + return None + + # if it's just dashes, it was meant to be positional + if not arg_string.strip('-'): + return None + + # if the option string is present in the parser, return the action + if arg_string in self._option_strings: + action = self._option_strings[arg_string] + return action, arg_string, None + + # search through all possible prefixes of the option string + # and all actions in the parser for possible interpretations + option_tuples = [] + prefix_tuples = self._get_option_prefix_tuples(arg_string) + for option_string in self._option_strings: + for option_prefix, explicit_arg in prefix_tuples: + if option_string.startswith(option_prefix): + action = self._option_strings[option_string] + tup = action, option_string, explicit_arg + option_tuples.append(tup) + break + + # if multiple actions match, the option string was ambiguous + if len(option_tuples) > 1: + options = ', '.join(opt_str for _, opt_str, _ in option_tuples) + tup = arg_string, options + self.error(_('ambiguous option: %s could match %s') % tup) + + # if exactly one action matched, this segmentation is good, + # so return the parsed action + elif len(option_tuples) == 1: + option_tuple, = option_tuples + return option_tuple + + # if it was not found as an option, but it looks like a negative + # number, it was meant to be positional + if self._negative_number_matcher.match(arg_string): + return None + + # it was meant to be an optional but there is no such option + # in this parser (though it might be a valid option in a subparser) + return None, arg_string, None + + def _get_option_prefix_tuples(self, option_string): + result = [] + + # option strings starting with '--' are only split at the '=' + if option_string.startswith('--'): + if '=' in option_string: + option_prefix, explicit_arg = option_string.split('=', 1) + else: + option_prefix = option_string + explicit_arg = None + tup = option_prefix, explicit_arg + result.append(tup) + + # option strings starting with '-' are split at all indices + else: + for first_index, char in enumerate(option_string): + if char != '-': + break + for i in xrange(len(option_string), first_index, -1): + tup = option_string[:i], option_string[i:] or None + result.append(tup) + + # return the collected prefix tuples + return result + + def _get_nargs_pattern(self, action): + # in all examples below, we have to allow for '--' args + # which are represented as '-' in the pattern + nargs = action.nargs + + # the default (None) is assumed to be a single argument + if nargs is None: + nargs_pattern = '(-*A-*)' + + # allow zero or one arguments + elif nargs == OPTIONAL: + nargs_pattern = '(-*A?-*)' + + # allow zero or more arguments + elif nargs == ZERO_OR_MORE: + nargs_pattern = '(-*[A-]*)' + + # allow one or more arguments + elif nargs == ONE_OR_MORE: + nargs_pattern = '(-*A[A-]*)' + + # allow one argument followed by any number of options or arguments + elif nargs is PARSER: + nargs_pattern = '(-*A[-AO]*)' + + # all others should be integers + else: + nargs_pattern = '(-*%s-*)' % '-*'.join('A' * nargs) + + # if this is an optional action, -- is not allowed + if action.option_strings: + nargs_pattern = nargs_pattern.replace('-*', '') + nargs_pattern = nargs_pattern.replace('-', '') + + # return the pattern + return nargs_pattern + + # ======================== + # Value conversion methods + # ======================== + + def _get_values(self, action, arg_strings): + # for everything but PARSER args, strip out '--' + if action.nargs is not PARSER: + arg_strings = [s for s in arg_strings if s != '--'] + + # optional argument produces a default when not present + if not arg_strings and action.nargs == OPTIONAL: + if action.option_strings: + value = action.const + else: + value = action.default + if isinstance(value, basestring): + value = self._get_value(action, value) + self._check_value(action, value) + + # when nargs='*' on a positional, if there were no command-line + # args, use the default if it is anything other than None + elif (not arg_strings and action.nargs == ZERO_OR_MORE and + not action.option_strings): + if action.default is not None: + value = action.default + else: + value = arg_strings + self._check_value(action, value) + + # single argument or optional argument produces a single value + elif len(arg_strings) == 1 and action.nargs in [None, OPTIONAL]: + arg_string, = arg_strings + value = self._get_value(action, arg_string) + self._check_value(action, value) + + # PARSER arguments convert all values, but check only the first + elif action.nargs is PARSER: + value = list(self._get_value(action, v) for v in arg_strings) + self._check_value(action, value[0]) + + # all other types of nargs produce a list + else: + value = list(self._get_value(action, v) for v in arg_strings) + for v in value: + self._check_value(action, v) + + # return the converted value + return value + + def _get_value(self, action, arg_string): + type_func = self._registry_get('type', action.type, action.type) + if not callable(type_func): + msg = _('%r is not callable') + raise ArgumentError(action, msg % type_func) + + # convert the value to the appropriate type + try: + result = type_func(arg_string) + + # TypeErrors or ValueErrors indicate errors + except (TypeError, ValueError): + name = getattr(action.type, '__name__', repr(action.type)) + msg = _('invalid %s value: %r') + raise ArgumentError(action, msg % (name, arg_string)) + + # return the converted value + return result + + def _check_value(self, action, value): + # converted value must be one of the choices (if specified) + if action.choices is not None and value not in action.choices: + tup = value, ', '.join(map(repr, action.choices)) + msg = _('invalid choice: %r (choose from %s)') % tup + raise ArgumentError(action, msg) + + + + # ======================= + # Help-formatting methods + # ======================= + + def format_usage(self): + formatter = self._get_formatter() + formatter.add_usage(self.usage, + self._get_optional_actions(), + self._get_positional_actions()) + return formatter.format_help() + + def format_help(self): + formatter = self._get_formatter() + + # usage + formatter.add_usage(self.usage, + self._get_optional_actions(), + self._get_positional_actions()) + + # description + formatter.add_text(self.description) + + # positionals + formatter.start_section(_('positional arguments')) + formatter.add_arguments(self._positional_actions_list) + formatter.end_section() + + # optionals + formatter.start_section(_('optional arguments')) + formatter.add_arguments(self._optional_actions_list) + formatter.end_section() + + # user-defined groups + for argument_group in self._argument_groups: + formatter.start_section(argument_group.title) + formatter.add_text(argument_group.description) + formatter.add_arguments(argument_group._positional_actions_list) + formatter.add_arguments(argument_group._optional_actions_list) + formatter.end_section() + + # epilog + formatter.add_text(self.epilog) + + # determine help from format above + return formatter.format_help() + + def format_version(self): + formatter = self._get_formatter() + formatter.add_text(self.version) + return formatter.format_help() + + def _get_formatter(self): + return self.formatter_class(prog=self.prog) + + # ===================== + # Help-printing methods + # ===================== + + def print_usage(self, file=None): + self._print_message(self.format_usage(), file) + + def print_help(self, file=None): + self._print_message(self.format_help(), file) + + def print_version(self, file=None): + self._print_message(self.format_version(), file) + + def _print_message(self, message, file=None): + if message: + if file is None: + file = _sys.stderr + file.write(message) + + + # =============== + # Exiting methods + # =============== + + def exit(self, status=0, message=None): + if message: + _sys.stderr.write(message) + _sys.exit(status) + + def error(self, message): + """error(message: string) + + Prints a usage message incorporating the message to stderr and + exits. + + If you override this in a subclass, it should not return -- it + should either exit or raise an exception. + """ + self.print_usage(_sys.stderr) + self.exit(2, _('%s: error: %s\n') % (self.prog, message)) diff --git a/zfec/fec/util/fileutil.py b/zfec/fec/util/fileutil.py new file mode 100644 index 0000000..ef75cc6 --- /dev/null +++ b/zfec/fec/util/fileutil.py @@ -0,0 +1,186 @@ +# Copyright (c) 2000 Autonomous Zone Industries +# Copyright (c) 2002-2007 Bryce "Zooko" Wilcox-O'Hearn +# This file is licensed under the +# GNU Lesser General Public License v2.1. +# See the file COPYING or visit http://www.gnu.org/ for details. +# Portions snarfed out of the Python standard library. +# The du part is due to Jim McCoy. + +""" +Futz with files like a pro. +""" + +import exceptions, os, stat, tempfile, time + +try: + from twisted.python import log +except ImportError: + class DummyLog: + def msg(self, *args, **kwargs): + pass + log = DummyLog() + +def rename(src, dst, tries=4, basedelay=0.1): + """ Here is a superkludge to workaround the fact that occasionally on + Windows some other process (e.g. an anti-virus scanner, a local search + engine, etc.) is looking at your file when you want to delete or move it, + and hence you can't. The horrible workaround is to sit and spin, trying + to delete it, for a short time and then give up. + + With the default values of tries and basedelay this can block for less + than a second. + + @param tries: number of tries -- each time after the first we wait twice + as long as the previous wait + @param basedelay: how long to wait before the second try + """ + for i in range(tries-1): + try: + return os.rename(src, dst) + except EnvironmentError, le: + # XXX Tighten this to check if this is a permission denied error (possibly due to another Windows process having the file open and execute the superkludge only in this case. + log.msg("XXX KLUDGE Attempting to move file %s => %s; got %s; sleeping %s seconds" % (src, dst, le, basedelay,)) + time.sleep(basedelay) + basedelay *= 2 + return os.rename(src, dst) # The last try. + +def remove(f, tries=4, basedelay=0.1): + """ Here is a superkludge to workaround the fact that occasionally on + Windows some other process (e.g. an anti-virus scanner, a local search + engine, etc.) is looking at your file when you want to delete or move it, + and hence you can't. The horrible workaround is to sit and spin, trying + to delete it, for a short time and then give up. + + With the default values of tries and basedelay this can block for less + than a second. + + @param tries: number of tries -- each time after the first we wait twice + as long as the previous wait + @param basedelay: how long to wait before the second try + """ + try: + os.chmod(f, stat.S_IWRITE | stat.S_IEXEC | stat.S_IREAD) + except: + pass + for i in range(tries-1): + try: + return os.remove(f) + except EnvironmentError, le: + # XXX Tighten this to check if this is a permission denied error (possibly due to another Windows process having the file open and execute the superkludge only in this case. + if not os.path.exists(f): + return + log.msg("XXX KLUDGE Attempting to remove file %s; got %s; sleeping %s seconds" % (f, le, basedelay,)) + time.sleep(basedelay) + basedelay *= 2 + return os.remove(f) # The last try. + +class NamedTemporaryDirectory: + """ + This calls tempfile.mkdtemp(), stores the name of the dir in + self.name, and rmrf's the dir when it gets garbage collected or + "shutdown()". + """ + def __init__(self, cleanup=True, *args, **kwargs): + """ If cleanup, then the directory will be rmrf'ed when the object is shutdown. """ + self.cleanup = cleanup + self.name = tempfile.mkdtemp(*args, **kwargs) + + def __repr__(self): + return "<%s instance at %x %s>" % (self.__class__.__name__, id(self), self.name) + + def __str__(self): + return self.__repr__() + + def __del__(self): + try: + self.shutdown() + except: + import traceback + traceback.print_exc() + + def shutdown(self): + if self.cleanup and hasattr(self, 'name'): + rm_dir(self.name) + +def make_dirs(dirname, mode=0777, strictmode=False): + """ + A threadsafe and idempotent version of os.makedirs(). If the dir already + exists, do nothing and return without raising an exception. If this call + creates the dir, return without raising an exception. If there is an + error that prevents creation or if the directory gets deleted after + make_dirs() creates it and before make_dirs() checks that it exists, raise + an exception. + + @param strictmode if true, then make_dirs() will raise an exception if the + directory doesn't have the desired mode. For example, if the + directory already exists, and has a different mode than the one + specified by the mode parameter, then if strictmode is true, + make_dirs() will raise an exception, else it will ignore the + discrepancy. + """ + tx = None + try: + os.makedirs(dirname, mode) + except OSError, x: + tx = x + + if not os.path.isdir(dirname): + if tx: + raise tx + raise exceptions.IOError, "unknown error prevented creation of directory, or deleted the directory immediately after creation: %s" % dirname # careful not to construct an IOError with a 2-tuple, as that has a special meaning... + + tx = None + if hasattr(os, 'chmod'): + try: + os.chmod(dirname, mode) + except OSError, x: + tx = x + + if strictmode and hasattr(os, 'stat'): + s = os.stat(dirname) + resmode = stat.S_IMODE(s.st_mode) + if resmode != mode: + if tx: + raise tx + raise exceptions.IOError, "unknown error prevented setting correct mode of directory, or changed mode of the directory immediately after creation. dirname: %s, mode: %04o, resmode: %04o" % (dirname, mode, resmode,) # careful not to construct an IOError with a 2-tuple, as that has a special meaning... + +def rm_dir(dirname): + """ + A threadsafe and idempotent version of shutil.rmtree(). If the dir is + already gone, do nothing and return without raising an exception. If this + call removes the dir, return without raising an exception. If there is an + error that prevents deletion or if the directory gets created again after + rm_dir() deletes it and before rm_dir() checks that it is gone, raise an + exception. + """ + excs = [] + try: + os.chmod(dirname, stat.S_IWRITE | stat.S_IEXEC | stat.S_IREAD) + for f in os.listdir(dirname): + fullname = os.path.join(dirname, f) + if os.path.isdir(fullname): + rm_dir(fullname) + else: + remove(fullname) + os.rmdir(dirname) + except Exception, le: + # Ignore "No such file or directory" + if (not isinstance(le, OSError)) or le.args[0] != 2: + excs.append(le) + + # Okay, now we've recursively removed everything, ignoring any "No + # such file or directory" errors, and collecting any other errors. + + if os.path.exists(dirname): + if len(excs) == 1: + raise excs[0] + if len(excs) == 0: + raise OSError, "Failed to remove dir for unknown reason." + raise OSError, excs + + +def remove_if_possible(f): + try: + remove(f) + except: + pass diff --git a/zfec/fec/util/mathutil.py b/zfec/fec/util/mathutil.py new file mode 100644 index 0000000..de95a6a --- /dev/null +++ b/zfec/fec/util/mathutil.py @@ -0,0 +1,92 @@ +# Copyright (c) 2005-2007 Bryce "Zooko" Wilcox-O'Hearn +# mailto:zooko@zooko.com +# http://zooko.com/repos/pyutil +# Permission is hereby granted, free of charge, to any person obtaining a copy +# of this work to deal in this work without restriction (including the rights +# to use, modify, distribute, sublicense, and/or sell copies). + +""" +A few commonly needed functions. +""" + +import math + +def div_ceil(n, d): + """ + The smallest integer k such that k*d >= n. + """ + return (n/d) + (n%d != 0) + +def next_multiple(n, k): + """ + The smallest multiple of k which is >= n. + """ + return div_ceil(n, k) * k + +def pad_size(n, k): + """ + The smallest number that has to be added to n so that n is a multiple of k. + """ + if n%k: + return k - n%k + else: + return 0 + +def is_power_of_k(n, k): + return k**int(math.log(n, k) + 0.5) == n + +def next_power_of_k(n, k): + p = 1 + while p < n: + p *= k + return p + +def ave(l): + return sum(l) / len(l) + +def log_ceil(n, b): + """ + The smallest integer k such that b^k >= n. + + log_ceil(n, 2) is the number of bits needed to store any of n values, e.g. + the number of bits needed to store any of 128 possible values is 7. + """ + p = 1 + k = 0 + while p < n: + p *= b + k += 1 + return k + +def linear_fit_slope(ps): + """ + @param ps a sequence of tuples of (x, y) + """ + avex = ave([x for (x, y) in ps]) + avey = ave([y for (x, y) in ps]) + sxy = sum([ (x - avex) * (y - avey) for (x, y) in ps ]) + sxx = sum([ (x - avex) ** 2 for (x, y) in ps ]) + if sxx == 0: + return None + return sxy / sxx + +def permute(l): + """ + Return all possible permutations of l. + + @type l: sequence + @rtype a set of sequences + """ + if len(l) == 1: + return [l,] + + res = [] + for i in range(len(l)): + l2 = list(l[:]) + x = l2.pop(i) + for l3 in permute(l2): + l3.append(x) + res.append(l3) + + return res + diff --git a/zfec/fec/util/version.py b/zfec/fec/util/version.py new file mode 100644 index 0000000..d5cbb36 --- /dev/null +++ b/zfec/fec/util/version.py @@ -0,0 +1,129 @@ +# Copyright (c) 2004-2007 Bryce "Zooko" Wilcox-O'Hearn +# mailto:zooko@zooko.com +# http://zooko.com/repos/pyutil +# Permission is hereby granted, free of charge, to any person obtaining a copy +# of this work to deal in this work without restriction (including the rights +# to use, modify, distribute, sublicense, and/or sell copies). + +""" +extended version number class +""" + +from distutils import version + +# End users see version strings like this: + +# "1.0.0" +# ^ ^ ^ +# | | | +# | | '- micro version number +# | '- minor version number +# '- major version number + +# The first number is "major version number". The second number is the "minor +# version number" -- it gets bumped whenever we make a new release that adds or +# changes functionality. The third version is the "micro version number" -- it +# gets bumped whenever we make a new release that doesn't add or change +# functionality, but just fixes bugs (including performance issues). + +# Early-adopter end users see version strings like this: + +# "1.0.0a1" +# ^ ^ ^^^ +# | | ||| +# | | ||'- release number +# | | |'- alpha or beta (or none) +# | | '- micro version number +# | '- minor version number +# '- major version number + +# The optional "a" or "b" stands for "alpha release" or "beta release" +# respectively. The number after "a" or "b" gets bumped every time we +# make a new alpha or beta release. This has the same form and the same +# meaning as version numbers of releases of Python. + +# Developers see "full version strings", like this: + +# "1.0.0a1-55-UNSTABLE" +# ^ ^ ^^^ ^ ^ +# | | ||| | | +# | | ||| | '- tags +# | | ||| '- nano version number +# | | ||'- release number +# | | |'- alpha or beta (or none) +# | | '- micro version number +# | '- minor version number +# '- major version number + +# The next number is the "nano version number". It is meaningful only to +# developers. It gets bumped whenever a developer changes anything that another +# developer might care about. + +# The last part is the "tags" separated by "_". Standard tags are +# "STABLE" and "UNSTABLE". + +class Tag(str): + def __cmp__(t1, t2): + if t1 == t2: + return 0 + if t1 == "UNSTABLE" and t2 == "STABLE": + return 1 + if t1 == "STABLE" and t2 == "UNSTABLE": + return -1 + return -2 # who knows + +class Version: + def __init__(self, vstring=None): + if vstring: + self.parse(vstring) + + def parse(self, vstring): + i = vstring.find('-') + if i: + svstring = vstring[:i] + estring = vstring[i+1:] + else: + svstring = vstring + estring = None + + self.strictversion = version.StrictVersion(svstring) + + if estring: + try: + (self.nanovernum, tags,) = estring.split('-') + except: + print estring + raise + self.tags = map(Tag, tags.split('_')) + self.tags.sort() + + self.fullstr = '-'.join([str(self.strictversion), str(self.nanovernum), '_'.join(self.tags)]) + + def tags(self): + return self.tags + + def user_str(self): + return self.strictversion.__str__() + + def full_str(self): + return self.fullstr + + def __str__(self): + return self.full_str() + + def __repr__(self): + return self.__str__() + + def __cmp__ (self, other): + if isinstance(other, basestring): + other = Version(other) + + res = cmp(self.strictversion, other.strictversion) + if res != 0: + return res + + res = cmp(self.nanovernum, other.nanovernum) + if res != 0: + return res + + return cmp(self.tags, other.tags) diff --git a/zfec/setup.py b/zfec/setup.py new file mode 100755 index 0000000..7a44bc1 --- /dev/null +++ b/zfec/setup.py @@ -0,0 +1,69 @@ +#!/usr/bin/env python + +# zfec -- fast forward error correction library with Python interface +# +# Copyright (C) 2007 Allmydata, Inc. +# Author: Zooko Wilcox-O'Hearn +# mailto:zooko@zooko.com +# +# This file is part of zfec. +# +# This program is free software; you can redistribute it and/or +# modify it under the terms of the GNU General Public License +# as published by the Free Software Foundation; either version 2 +# of the License, or (at your option) any later version. +# +# This program is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with this program; if not, write to the Free Software +# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA. + +from distutils.core import Extension, setup + + +DEBUGMODE=False +# DEBUGMODE=True + +extra_compile_args=[] +extra_link_args=[] + +extra_compile_args.append("-std=c99") + +undef_macros=[] + +if DEBUGMODE: + extra_compile_args.append("-O0") + extra_compile_args.append("-g") + extra_compile_args.append("-Wall") + extra_link_args.append("-g") + undef_macros.append('NDEBUG') + +trove_classifiers=[ + "Development Status :: 4 - Beta", + "Environment :: No Input/Output (Daemon)", + "Intended Audience :: Developers", + "License :: OSI Approved :: GNU General Public License (GPL)", + "Natural Language :: English", + "Operating System :: OS Independent", + "Programming Language :: C", + "Programming Language :: Python", + "Topic :: System :: Archiving :: Backup", + ] + +setup(name='zfec', + version='1.0.0a1', + summary='Provides a fast C implementation of Reed-Solomon erasure coding with a Python interface.', + description='Erasure coding is the generation of redundant blocks of information such that if some blocks are lost ("erased") then the original data can be recovered from the remaining blocks. This package contains an optimized implementation along with a Python interface.', + author='Zooko O\'Whielacronx', + author_email='zooko@zooko.com', + url='http://www.allmydata.com/source/zfec', + license='GNU GPL', + platform='Any', + packages=['fec', 'fec.util', 'fec.test'], + classifiers=trove_classifiers, + ext_modules=[Extension('_fec', ['fec/fec.c', 'fec/_fecmodule.c',], extra_link_args=extra_link_args, extra_compile_args=extra_compile_args, undef_macros=undef_macros),], + )