cardinal_pythonlib.openxml.grep_in_openxml


Original code copyright (C) 2009-2022 Rudolf Cardinal (rudolf@pobox.com).

This file is part of cardinal_pythonlib.

Licensed under the Apache License, Version 2.0 (the “License”); you may not use this file except in compliance with the License. You may obtain a copy of the License at

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an “AS IS” BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.


Performs a grep (global-regular-expression-print) search of files in OpenXML format, that is, of the files stored inside their ZIP containers. See the command-line help for details.
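Because an OpenXML document (for example, a .docx file) is a ZIP archive of mostly XML parts, a search of this kind amounts to opening the archive and applying a regular expression to each inner file. The following is a minimal sketch of that idea using only the standard library. It is not this module's implementation; `grep_zip_bytes`, `make_sample_archive` and the sample file contents are hypothetical names introduced for illustration.

```python
from __future__ import annotations

import io
import re
import zipfile


def grep_zip_bytes(
    zip_bytes: bytes, regex: re.Pattern[bytes]
) -> list[tuple[str, bytes]]:
    """Return (inner filename, matching line) pairs from a ZIP archive.

    Hypothetical helper for illustration; not part of cardinal_pythonlib.
    """
    hits = []
    with zipfile.ZipFile(io.BytesIO(zip_bytes), "r") as zf:
        for inner_name in zf.namelist():
            with zf.open(inner_name, "r") as f:
                # Inner files are read as bytes, so a bytes pattern is used.
                for line in f.read().splitlines():
                    if regex.search(line):
                        hits.append((inner_name, line))
    return hits


def make_sample_archive() -> bytes:
    """Build a tiny in-memory ZIP that mimics an OpenXML layout."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr(
            "word/document.xml",
            "<w:t>Hello world</w:t>\n<w:t>Goodbye</w:t>\n",
        )
        zf.writestr("docProps/core.xml", "<dc:creator>Someone</dc:creator>\n")
    return buf.getvalue()


if __name__ == "__main__":
    pattern = re.compile(rb"Hello")
    for name, line in grep_zip_bytes(make_sample_archive(), pattern):
        print(f"{name}: {line.decode('utf-8')}")
```

Working on an in-memory archive keeps the sketch self-contained; for a file on disk, the filename can be passed to `zipfile.ZipFile` directly.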

Version history:

  • Written 28 Sep 2017.

Notes:

  • Use the vbindiff tool to show how two binary files differ.

cardinal_pythonlib.openxml.grep_in_openxml.main() → None

Command-line handler for the grep_in_openxml tool. Use the --help option for help.

cardinal_pythonlib.openxml.grep_in_openxml.parse_zip(zipfilename: str, regex: Pattern, invert_match: bool, files_with_matches: bool, files_without_match: bool, grep_inner_file_name: bool, show_inner_file: bool) → None

Implements a “grep within an OpenXML file” for a single OpenXML file, which is by definition a .zip file.

Parameters:
  • zipfilename – name of the OpenXML (zip) file

  • regex – regular expression to match

  • invert_match – find files that do NOT match, instead of ones that do?

  • files_with_matches – show filenames of files with a match?

  • files_without_match – show filenames of files with no match?

  • grep_inner_file_name – search the names of “inner” files, rather than their contents?

  • show_inner_file – show the names of the “inner” files, not just the “outer” (OpenXML) file?
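How these options might combine can be sketched as follows. This is an illustration only, not the library's code: `sketch_grep_archive` and `build_demo_archive` are hypothetical names, the option semantics are assumed to mirror grep's `-v`, `-l` and `-L`, the output format is invented for the example, and results are returned as a list of strings rather than printed.

```python
from __future__ import annotations

import io
import re
import zipfile


def sketch_grep_archive(
    label: str,
    zip_bytes: bytes,
    regex: re.Pattern[str],
    invert_match: bool = False,
    files_with_matches: bool = False,
    files_without_match: bool = False,
    grep_inner_file_name: bool = False,
) -> list[str]:
    """Return the lines a grep-like search might report for one archive.

    Hypothetical illustration; option semantics are assumptions, not
    taken from the cardinal_pythonlib source.
    """
    if files_with_matches and files_without_match:
        raise ValueError("Choose at most one filename-only mode")
    # Each candidate is (inner filename, text to test against the regex).
    candidates: list[tuple[str, str]] = []
    with zipfile.ZipFile(io.BytesIO(zip_bytes), "r") as zf:
        for inner_name in zf.namelist():
            if grep_inner_file_name:
                # Search the inner file's name rather than its contents.
                candidates.append((inner_name, inner_name))
            else:
                text = zf.read(inner_name).decode("utf-8", errors="replace")
                candidates.extend(
                    (inner_name, line) for line in text.splitlines()
                )
    any_hit = any(regex.search(text) for _, text in candidates)
    if files_with_matches:
        return [label] if any_hit else []
    if files_without_match:
        return [] if any_hit else [label]
    # Line mode: report matching lines, or non-matching ones if inverted.
    return [
        f"{label} [{inner_name}]: {text}"
        for inner_name, text in candidates
        if bool(regex.search(text)) != invert_match
    ]


def build_demo_archive() -> bytes:
    """Build a small in-memory ZIP with two inner files (made-up content)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("word/document.xml", "alpha\nbeta\n")
        zf.writestr("docProps/core.xml", "gamma\n")
    return buf.getvalue()


if __name__ == "__main__":
    demo = build_demo_archive()
    for out in sketch_grep_archive("demo.docx", demo, re.compile("alpha")):
        print(out)
```

Collecting candidates first and deciding what to report afterwards keeps the sketch short; a real tool working on large archives would more likely stream lines and stop early in the filename-only modes.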

cardinal_pythonlib.openxml.grep_in_openxml.report_hit_filename(zipfilename: str, contentsfilename: str, show_inner_file: bool) → None

For “hits”: prints either the .zip filename, or the .zip filename and the inner filename.

Parameters:
  • zipfilename – filename of the .zip file

  • contentsfilename – filename of the inner file

  • show_inner_file – if True, show both filenames; if False, show just the .zip filename

cardinal_pythonlib.openxml.grep_in_openxml.report_line(zipfilename: str, contentsfilename: str, line: str, show_inner_file: bool) → None

Prints a line from a file, with the .zip filename and optionally also the inner filename.

Parameters:
  • zipfilename – filename of the .zip file

  • contentsfilename – filename of the inner file

  • line – the line from the inner file

  • show_inner_file – if True, show both filenames; if False, show just the .zip filename

cardinal_pythonlib.openxml.grep_in_openxml.report_miss_filename(zipfilename: str) → None

For “misses”: prints the zip filename.
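
The effect of show_inner_file on the three reporting helpers described above can be illustrated with a small sketch. The `format_*` functions below are hypothetical stand-ins that return strings instead of printing, and the exact output layout (square brackets around the inner filename, a colon before the line) is an assumption made for the example, not a documented format.

```python
def format_hit_filename(
    zipfilename: str, contentsfilename: str, show_inner_file: bool
) -> str:
    """Hypothetical analogue of report_hit_filename (layout is assumed)."""
    if show_inner_file:
        return f"{zipfilename} [{contentsfilename}]"
    return zipfilename


def format_line(
    zipfilename: str, contentsfilename: str, line: str, show_inner_file: bool
) -> str:
    """Hypothetical analogue of report_line (layout is assumed)."""
    if show_inner_file:
        return f"{zipfilename} [{contentsfilename}]: {line}"
    return f"{zipfilename}: {line}"


def format_miss_filename(zipfilename: str) -> str:
    """Hypothetical analogue of report_miss_filename: only the outer name."""
    return zipfilename


if __name__ == "__main__":
    print(format_hit_filename("report.docx", "word/document.xml", True))
    print(format_line("report.docx", "word/document.xml", "some text", False))
    print(format_miss_filename("report.docx"))
```

A miss has no inner filename to show, since by definition no inner file matched, which is consistent with report_miss_filename taking only the .zip filename.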