Hello, I'm looking forward to using this tool! However, I'm getting errors when trying to run twextract. It looks like a Unicode / character set problem. I'm running this on a Debian etch system, which I believe uses UTF-8 by default. Error dump:
Traceback (most recent call last):
  File "/usr/bin/twextract", line 7, in ?
    sys.exit(
  File "/usr/lib/python2.4/site-packages/WikklyText-0.99.50-py2.4.egg/wikklytext/scripts/twextract.py", line 351, in do_main
    print "Got %d names" % len(istore.names())
  File "/usr/lib/python2.4/site-packages/WikklyText-0.99.50-py2.4.egg/wikklytext/store/wikStore_tw.py", line 39, in names
    self.ensure_cache()
  File "/usr/lib/python2.4/site-packages/WikklyText-0.99.50-py2.4.egg/wikklytext/store/wikStore_tw.py", line 64, in ensure_cache
    for item in self.store.getall():
  File "/usr/lib/python2.4/site-packages/WikklyText-0.99.50-py2.4.egg/wikklytext/store/wikStore_tw_re.py", line 130, in getall
    return [self.parse_div(div) for div in self.get_all_divs()]
  File "/usr/lib/python2.4/site-packages/WikklyText-0.99.50-py2.4.egg/wikklytext/store/wikStore_tw_re.py", line 339, in get_all_divs
    return self.get_store().split('</div>')[:-1]
  File "/usr/lib/python2.4/site-packages/WikklyText-0.99.50-py2.4.egg/wikklytext/store/wikStore_tw_re.py", line 326, in get_store
    buf = self.loadfile()
  File "/usr/lib/python2.4/site-packages/WikklyText-0.99.50-py2.4.egg/wikklytext/store/wikStore_tw_re.py", line 271, in loadfile
    return unicode(buf, 'utf-8')
UnicodeDecodeError: 'utf8' codec can't decode byte 0xbb in position 12358: unexpected code byte
unicode problems with twextract?
Thanks for any pointers,
Nathan