filter_file uses "surrogateescape" error handling (#12765)
From Python docs:
--
'surrogateescape' will represent any incorrect bytes as code points in the Unicode Private Use Area ranging from U+DC80 to U+DCFF. These private code points will then be turned back into the same bytes when the surrogateescape error handler is used when writing data. This is useful for processing files in an unknown encoding.
--

This will allow us to process files with unknown encodings.

To accommodate the case of self-extracting bash scripts, filter_file can now stop filtering text input if a certain marker is found. The marker must be passed at call time via the "stop_at" function argument. Once the marker is found, the file is reopened in binary mode and the remainder is copied verbatim.

* use "surrogateescape" error handling to preserve bytes that cannot be decoded
* allow filtering to stop when a marker is found
* add unit tests for non-ASCII and mixed text/binary files
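A minimal sketch of the round trip that the quoted passage describes, using only Python's built-in bytes.decode and str.encode. The sample bytes and the /opt/aspera replacement path are made up for illustration; this is not the filter_file implementation itself.

```python
# Bytes 0xff and 0xfe can never appear in valid UTF-8, so a strict decode
# of this buffer would raise UnicodeDecodeError.
raw = b"INSTALL_DIR=~/.aspera\n\xff\xfe binary payload\n"

# With surrogateescape, each undecodable byte becomes one code point in
# the range U+DC80..U+DCFF instead of raising an error.
text = raw.decode("utf-8", errors="surrogateescape")

# Ordinary string operations work on the decoded text.
edited = text.replace("INSTALL_DIR=~/.aspera", "INSTALL_DIR=/opt/aspera")

# Encoding with the same handler turns the escaped code points back into
# the original bytes, so the undecodable part survives untouched.
result = edited.encode("utf-8", errors="surrogateescape")
```

The same handler has to be used on both the decode and the encode side; encoding the edited text with the default strict handler would fail on the escaped code points.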

Committed by: Todd Gamblin
Parent: 3f46f03c83
Commit: 5cd28847e8

@@ -26,7 +26,8 @@ def install(self, spec, prefix):
         filter_file('INSTALL_DIR=~/.aspera',
                     'INSTALL_DIR=%s' % prefix,
                     runfile,
-                    string=True)
+                    string=True,
+                    stop_at='__ARCHIVE_FOLLOWS__')
         # Install
         chmod = which('chmod')
         chmod('+x', runfile)
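
The behavior that stop_at enables in the call above can be sketched roughly as follows. The helper name filter_until_marker, the temporary file suffix, and the rule that the marker must match a whole stripped line are all assumptions made for illustration; the real filter_file may differ, and this sketch keeps a single binary handle open rather than reopening the file.

```python
import os
import shutil
import tempfile


def filter_until_marker(path, old, new, stop_at=None):
    """Hypothetical helper: replace `old` with `new` line by line, then copy
    everything from the `stop_at` marker line onward byte for byte."""
    tmp_path = path + ".filtered"
    with open(path, "rb") as src, open(tmp_path, "wb") as dst:
        for raw_line in src:
            # surrogateescape lets arbitrary bytes survive the round trip.
            line = raw_line.decode("utf-8", errors="surrogateescape")
            if stop_at is not None and line.strip() == stop_at:
                # Marker found: write it unchanged and copy the rest verbatim.
                dst.write(raw_line)
                shutil.copyfileobj(src, dst)
                break
            filtered_line = line.replace(old, new)
            dst.write(filtered_line.encode("utf-8", errors="surrogateescape"))
    os.replace(tmp_path, path)


# Build a small stand-in for a self-extracting script: a text header,
# the marker line, then a binary payload that must not be modified.
workdir = tempfile.mkdtemp()
runfile = os.path.join(workdir, "installer.sh")
payload = b"\x00\xffINSTALL_DIR=~/.aspera\n\x80"
header = b"#!/bin/sh\nINSTALL_DIR=~/.aspera\n__ARCHIVE_FOLLOWS__\n"
with open(runfile, "wb") as f:
    f.write(header + payload)

filter_until_marker(runfile, "INSTALL_DIR=~/.aspera",
                    "INSTALL_DIR=/opt/aspera",
                    stop_at="__ARCHIVE_FOLLOWS__")

with open(runfile, "rb") as f:
    filtered = f.read()
```

In this sketch the header line is rewritten while the payload, which happens to contain the same search string, is left untouched because it follows the marker.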